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I. 



INTRODUCTION 



This book is concerned with numerical integration in general p dimensional spaces. To under- 
stand why special methods are needed, let us consider for the moment trapezoidal or center-of-bin 
integration on the unit interval in p = 1 dimension. Since these are both second order methods, 
to achieve an accuracy of one part in 10 one needs a division of the unit interval into roughly 
100 subdivisions, with an evaluation of the integrand function at each. This poses no problem for 
numerical evaluation, but suppose instead we wish to integrate a function over a 9 dimensional 
region, achieving a similar accuracy of one part in 10 4 . One then needs 100 divisions per axis, and 
(100) 9 = 10 18 function evaluations, which is a daunting task even for the fastest current comput- 
ers. So a brute force extension of the trapezoidal rule (or similar higher order methods, such as 
Simpson's rule) is not a viable approach when the dimension of the space p is more than around 
four. 

In consequence, methods for high dimensional spaces have focused on adaptive algorithms, in 
which function evaluations are concentrated in regions where the integrand is large and rapidly 
varying. Both Monte Carlo and deterministic algorithms have been proposed and widely used. 
Typically, they start from a base region, and then subdivide or refine on one to three or four sides 
along which the integrand is most rapidly varying. The process is then iterated, leading to finer 
subdivisions and an improved estimate of the integrand. Most authors, however, have considered 
it to be computationally prohibitive to proceed at each step by dividing the base region into 2 P 
subregions, so that the maximal length of each side is reduced by a factor of 2 at each step. Such a 
subdivision would allow localization of isolated integrand peaks in p dimensions, giving a method 
with the potential of achieving high accuracy for integrations in high dimensional spaces, and high 
resolution in applications such as template-based pattern recognition. 

The motivation for this book is the observation that computer speed has dramatically increased 
in recent years, while the cost of memory has simultaneously dramatically decreased; our current 
laptop speeds, and memories, are characterized by "giga" rather than the "mega" of two decades 
ago. So it is now timely to address the problem of formulating practical high dimension integration 
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routines that proceed by 2 P subdivision. We will develop methods for adaptive integration over 
both general simplexes, and axis-parallel hypercubes. Our simplex method is based on combining 
Moore's (1992) algorithm for 2 P subdivision of a general simplex, with new formulas for parame- 
terized higher order integration over a general simplex that we derive using the centroid approach 
of Good and Gaskins (1969, 1971), to give a a fully localizable adaptive integration procedure 
for general dimension p > 1. In addition to giving a hypercube method based on partition into 
simplexes, we also give a simpler, direct method for integration over hypercubes, constructed by 
analogy with our methods for simplexes. We focus specifically on a few special base region geome- 
tries: the standard simplex (relevant for calculating Feynman parameter integrals in physics), the 
Kuhn simplex, which can be used to tile the p dimensional side 1 hypercube by symmetrization 
of the integrand, and the half-side 1 hypercube, which for which we give direct algorithms which 
are simpler than the simplex-based algorithms. By changes of variable, any multiple integral with 
fixed limits of integration in each dimension can be converted to an integration over the side 1 or 
half-side 1 hypercube. In the following sections we develop the theory behind our methods, and 
then give a suite of Fortran programs, for both serial and MPI parallel computation, implementing 
them. 



II. ONE DIMENSIONAL ADAPTIVE INTEGRATION 



As a simple example, let us sketch how to write an adaptive integration program in one dimen- 
sion for the integral 



Cdxf(x) . (1) 
J o 



A first estimate can be obtained by using the trapezoidal rule 

J o ~0.5[/(0) + /(l)] , (2) 

and a second estimate obtained by using the center-of-bin rule 

4-/(0.5) . (3) 

These are both first order accurate methods, but since they are applied to the entire interval (0, 1) 
there will be a significant error, unless f(x) happens to be a linear function over the interval. If 
we want an evaluation of the integral with an estimated error e, we test whether \I a — Ib\ < e. If 
this condition is satisfied, we output I a and 1^ as estimates of the integral. If the condition is not 
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satisfied, we subdivide the interval (0, 1) into two half-sized intervals (0,0.5) and (0.5, 1). In each 
subinterval we follow the same procedure. For a subinterval with upper limit xjj and lower limit 
xl, and midpoint xm, we now define 

I a ~ 0.5[/(^) + f(x L )) , (4) 

and 

h^f{x M ) ■ (5) 

For the two subintervals, we evaluate the trapezoidal and center-of-bin approximations to the inte- 
gral, keeping I a (subinterval) and 4 (subinterval) for the subinterval, multiplied by the subinterval 
width of 1/2, as contributions to the answer if the "thinning" condition 

| I a (subinterval) — lb (subinterval) | < e (6) 

is met, and subdividing the interval by half again if this condition is not met. When, after a sequence 
of subdivisions, the condition is met for all subintervals, we have obtained good approximations to 
both a trapezoidal and center-of-bin evaluation of the integral, 



I a ~ L (subinterval) I a (subinterval) , 

subintervals 

lb ~ L (subinterval) J& (subinterval) 



subintervals 

(7) 

Here L (subinterval) is the subinterval length, and since the subintervals are a tiling of the interval 
(0, 1), we clearly have 

L (subinterval) = 1 (8) 

subintervals 

From the difference of I a and lb we get an estimate of the error, given by 

|outdiff| = \I a - 4| . (9) 
We can also compute the sum of the absolute values of the local subinterval errors, 

errsum = L (subinterval) \I a (subinterval) — i& (sub interval) | > |outdiff| . (10) 

subintervals 

When the condition \I a (subinterval) — 4 (subinterval) | < e is met for all subintervals, then errsum 
reduces, using Eq. ([8]), to 

errsum < e , (11) 
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and if all local subinterval errors have the same sign, then we have errsum = |outdiff|. 

If the process of subdivision has to be stopped before the condition \I a (subinterval) — 
lb (subinterval) | < e is satisfied for all subintervals, with the remaining subregion contributions 
added to I a and If, before the program terminates, then errsum will typically be larger than e. 
Such premature termination can happen for very irregular or singular functions, or if the parame- 
ter e is made too small, or if one subdivides without imposing the thinning condition of Eq. ([6]). 
But for smooth functions f(x) and thinning with attainable e the subdivision process will terminate 
quite rapidly. The reason is that both the trapezoidal and center-of-bin methods are accurate to 
first order with a second order error, and so the difference I a (subinterval) — lb (subinterval) scales 
as [L (subinterval)] 2 as the subinterval length L (subinterval) approaches zero. 

The adaptive integration method just sketched is easily programmed, and works well. One does 
not have to keep track of the relative location of the various subintervals, only of their starting and 
ending x values. Thus, one maintains a list of active subintervals, stored in any convenient order; 
when a subinterval is divided the two resulting halves are added to the list of active subintervals, 
while if a subinterval obeys the thinning condition , its contributions to I a , lb, and errsum are added 
to an accumulation register, and the subinterval is removed from the list of active subintervals. 

Even faster termination is obtained if Simpson's rule or an even higher order integration rule is 
used; see for example the Wikipedia article on the McKeeman (1962) adaptive Simpson rule, and 
references given there. The idea again is to compute two different evaluations of the integral over 
each subinterval, giving an error estimate that is used to determine whether to "harvest" the result 
at that level of subdivision, or to subdivide further. In generalizing to higher dimensional integrals, 
the same features persist: for each integration subregion, we evaluate a local thinning condition 
obtained from the difference of two alternative higher order integration rules. If the condition is 
obeyed, that subregion is "harvested" and deleted from the list of active subregions; if the condition 
is not obeyed, the subregion is further subdivided and the resulting smaller subregions are added 
to the active list. 

III. GENERALIZING TO HIGHER DIMENSIONS: SIMPLEXES AND HYPERCUBES. 

REVIEW OF PRIOR WORK. 

The first question to decide in generalizing to higher dimensions is the choice of base region 
geometry. There are two natural higher dimensional analogs of the one dimensional interval (0, 1). 
The first is the side 1 hypercube (0, 1) ® (0, 1) (g) ... <S> (0, 1), and the second is what we will term a 



FIG. 1: From left to right, the unit standard simplex, the side 1 hypercube, and the half-side 1 hypercube, 
in 2 dimensions. 

standard simplex with vertices (0,0,..., 0), (0,1,0,0 0), (0, 0, 1, 0, 0...., 0), (0, 0, 0, 1). We 

will also make use of the half-side 1 hypercube, spanning (—1, 1) <g> (—1, 1) (g> ... (g> (—1, 1). These 
three basic regions are illustrated, in two dimensions, in Fig. 1. 

Some simple geometric facts are important in setting a strategy. For a p dimensional simplex, 
the number of vertices is p + 1 and the number of sides connecting vertices is (p + l)p/2, both of 
which have polynomial growth. Thus, the indexing problem of keeping track of vertices which define 
active regions is relatively simple. For a p dimensional hypercube, the number of m-dimensional 
hypercubes on the boundary (see, e.g., the Wikipedia article on hypercubes) is 2 p ~ m p\/ {m\{p — 
m) !), and so the number of vertices {m = 0) is 2 P , and the number of sides connecting vertices 
(m = 1) is p2 p ~ 1 , both of which grow exponentially with p. Thus, if one labels hypercubes in 
terms of their vertices or sides, an exponentially growing index is required for large p. However, for 
the maximal boundary hypercube, with m = p — 1, the number given by the above formula is just 
2p, which again has polynomial, in fact linear, growth. (For example, a square has 2x2 = 4 lines 
as sides, and a cube has 2x3 = 6 squares as faces.) So our direct method for hypercubes will use 
geometric features of the maximal boundary hypercubes for indexing, subdivision, and integration, 
closely following the methods that we develop for simplexes. 

Before getting into further details, let us first give a very brief survey of adaptive methods 
for higher dimensional integration that are currently in the literature. A method that is widely 
used by physicists to evaluate Feynman parameter integrals is the VEGAS program of Lepage 
(1978), which uses a hypercube as the base geometry. This is a Monte Carlo method, in which 
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random samplings of the integration volume are done with a separable probability density that is 
a product of one dimensional densities along each axis. This probability density is then iterated 
to give a more detailed sampling along axes on which the projection of the integrand is rapidly 
varying. A deterministic method of Genz and Cools (2003) is based on simplexes as the base 
regions. The algorithm picks the subregion with the largest estimated error, and subdivides it into 
up to four equal volume subregions by cutting edges along which the integrand is most rapidly 
varying. This, and related adaptive algorithms, are discussed in the survey of CUBPACK by Cools 
and Haegemans (2003). The CUBA set of algorithms described by Hahn (2005) includes both 
Monte Carlo methods and deterministic methods; the former include refinements of VEGAS and 
the latter proceed by bisection of the subregion with largest error. A survey of many types of high 
dimensional integration algorithms, including adaptive algorithms, is contained in the HIntLib 
Manual of Schiirer (2008). 

Most of the algorithms just described do not proceed directly to a 2 P subdivision of the base 
region (although the possibility of 2 P subdivision is sketched in "Algorithm 2" of Cools and Haege- 
mans (2003)). An algorithm in the literature which makes use of a 2 P subdivision was given by 
Kahaner and Wells (1979). Unlike the algorithms which we develop below, which work directly 
from the vertex coordinates of a general simplex, the Kahaner Wells algorithm uses changes of 
variable for both simplex subdivision and integration. It also rank orders the errors for each sub- 
region (as do most of the algorithms described in the preceding paragraph), and at each stage 
subdivides the subregion with the largest contribution to the total error. While this global method 
may result in efficiencies in reducing the number of subdivisions needed, it makes parallelization 
of the algorithm more complicated, since the computations for the different subregions are not 
independent of one another. Also, when many subregions have errors of similar size, which is often 
the case, the computational effort involved in rank ordering the errors may not be justified. In 
the algorithms developed below, as in the one-dimensional example given in Sec. 1, we use a local 
thinning condition for the subregions, making it easy to turn serial versions of the algorithm into 
parallel ones. We note, however, that the subdivision and integration methods that we use could 
also be incorporated into global adaptive algorithms. 

IV. SIMPLEX PROPERTIES AND APPLICATIONS 

Any set of p + 1 points in p dimensional space defines a p-simplex, and we will be concerned 
with integrations over the interior region defined this way. Thus, in 1 dimension, 2 points define a 
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1-simplex that is the line segment joining them, in 2 dimensions, 3 points define a 2-simplex that 
is a triangle, in three dimensions, 4 points define a 3-simplex that is a tetrahedron, and so forth. 
We will refer to the p + 1 points, that each define a p- vector, as the vertices of the simplex, and our 
strategy will be to express all operations, both for the subdivision of simplexes and for calculating 
approximations to integrals over simplexes, directly in terms of these vertices. Our convention, 
both here in the text and in the programs, is that the p + 1 vertices of a simplex are enumerated 
from to p, and the p vector components of each vertex are enumerated from 1 to p. Both will be 
denoted by subscripts; it should be clear from context and from the range of the index whether an 
index is the label of a simplex vertex, as in xq, ...,x p , or the component index of a general point 
x, as in x±, ...,x p . In this notation, the ith. component of the jth. simplex vertex is denoted by a 
double subscript Xji. 



A. Simplex properties 

A simplex forms a convex set. This means that for any integer n > 1 and any set of points 
xi,....,x n lying within (or on the boundary) of a simplex, and any set of non- negative numbers 
a±, ....,a n which sum to unity, 

aj > 0, j = l,...,n , 



a J = 1 



(12) 



the point 



n 



X 



3=1 

also lies within (or on the boundary) of the simplex (see, e.g., Osborne (2001)). 

In constructing integration rules for simplexes, we will be particularly interested in linear com- 
binations of the form of Eq. (113[) in which the points vertices of the simplex. For 
such sums, one can state a rule which determines precisely where the point x lies with respect to 
the boundaries of the simplex. Let xq,x±, ...,x p be the vertices of a simplex, and let x c denote the 
centroid of the simplex, 

p 
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Let us denote by Xj the coordinates of the vertices with respect to the centroid as origin, 

X j — ^ c i ( 

which obey the constraint following from Eq. (j!4[) . 

v 

E^ =0 • ( 16 ) 

3=0 

Correspondingly, let x denote a general point, and let X — X X £ denote the general point referred 
to the centroid as origin. Since we are assuming that the simplex is non-degenerate, the vectors Xj 
span a linearly independent basis for the p-dimensional space, and so we can always expand x as 
a linear combination of the Xj, 

v 

x = Y^ ajxj . (17) 

j=0 

This expansion is not unique, since by Eq. (1161) we can add a constant a to all of the coefficients 
ctj, without changing the sum in Eq. ()17p . In particular, we can use this freedom to put the 
expansion of Eq. (|17p in a standard form, which we will assume henceforth, in which the sum of 
the coefficients ay is unity, 

X>; = 1 • (18) 

j=0 

For coefficients (called barycentric coordinates) obeying this unit sum condition, we can use Eqs. 
(dU) and ([IS]) to also write 

p 

X = Y1 a i x i ■ ( 19 ) 

3=0 

In terms of the expansion of Eqs. (|1T|) through f|19j) we can now state a rule (see Pontryagin 
(1952) and the Wikipedia article on barycentric coordinates) for determining where the point x 
lies with respect to the simplex: (1) If all of the ay are strictly positive, the point lies inside the 
boundaries of the simplex; (2) If a coefficient otj is zero, the point lies on the boundary plane 
opposite to the vertex Xj, and if several of the ay vanish, the point lies on the intersection of 
the corresponding boundary planes; (3) If any coefficient ctj is negative, the point lies outside the 
simplex. 

To derive this rule, we observe that a point x lies within the simplex only if it lies on the same 
side of each boundary plane of the simplex as the simplex vertex opposite that boundary. Let us 
focus on one particular vertex of the simplex, which we label x p , so that the other p vertices are 
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xq, ...,x p -i. These p vertices span an affine hyperplane, which divides the p-dimensional space into 
two disjoint parts, and constitutes the simplex boundary hyperplane opposite the simplex vertex 
x p . A general parameterization of this hyperplane takes the form 

p-i 

x = x + ^2 Pi( x i ~ x o) > ( 20 ) 
i=l 

that is, we take xq as a fiducial point on the hyperplane and add arbitrary multiples of a complete 
basis of vectors Xj — xq in the hyperplane. Rewriting Eq. ([20]) as 

p-i 

x = J2^ x i ' ( 21 ) 

3=0 

with 7q = 1 — Y^j=i Pj an£ i Ij = Pji 3 ^ 1) we see that the p coefficients jj obey the condition 

p-i 

J> = 1 • (22) 

3=0 

By virtue of this condition, we can also write the hyperplane parameterization of Eq. (|2ip in terms 
of coordinates with origin at the simplex centroid, 

p-i 

X = Y1 W • (23) 

3=0 

We now wish to determine whether the general point x lies on the same side of this hyperplane 
as the vertex x p , or lies on the hyperplane, or lies on the opposite side from x p , by using the 
expansion of Eqs. (]17p and (|18|) . which we rewrite in the form 

p— l p— l 

X = CXjXj + Xj 
3=0 ^ 3=0 

P-I 

/ > x j a p x p ■ 

P 3=0 

(24) 

The first line on the right hand side of Eq. (|24p has the form of the hyperplane parameterization 
of Eq. (|23|) . since by construction the coefficients add up to unity, and so this part of the right 
hand side is a point on the boundary hyperplane opposite the vertex x p . The second line on the 
right hand side of Eq. (|24p can be rewritten, by using Eq. (|16p. as 

"p^-^p • (25) 



12 



To appreciate the significance of this, we note that the centroid of the p points defining the boundary 
hyperplane is 

p-i 

i 

Xh: 



1 P 1 

-^2 s 3 = --^p > ( 26 ) 



P£5 P 

where we have again used Eq. (|16p . Therefore the vector from the centroid of the points defining 
the hyperplane to the vertex x p is 

~ _ " - P±l~ (07\ 
Xp %h\c — Xp . \L I J 

P 

So Eq. (|25p tells us that the point x is displaced from the hyperplane by a vector parallel to that 
of Eq. (|27p . with its length rescaled by the factor a p . Therefore, if a p > 0, the point x, lies on 
the same side of the boundary hyperplane as the opposite vertex x p . If a p = 0, the point x lies 
on the boundary hyperplane, and if a p < 0, the point x lies on the opposite side of the boundary 
hyperplane from the vertex x p . Applying this argument to all p + 1 vertices in turn gives the rules 
stated above. 

In constructing integration rules for simplexes, we will use the following elementary corollary 
of the result that we have just derived. Consider the sum 

N 



X = Y J Kx l , (28) 



with the coefficients obeying 



A, > 0, i = l,...,N 

N 

5>,<1 , 

i=l 

(29) 

with the points Xi any vertices of a simplex. Some vertices may be omitted, and some used more 
than once, in the sum of Eq. (|28p . Then the point X lies inside the simplex. To see this, we note 
that by adding a positive multiple of zero in the form of Eq. (|16p . the sum of Eq. (|28|) can be 
reduced to the form of Eqs. ()1T|) and (|18p . with all expansion coefficients aj strictly positive. By 
the rule stated above, this implies that the point X lies within the simplex. 

B. Simplex applications 

Our programs for p-dimensional integration make special use of two kinds of simplexes, the unit 
"standard simplex" introduced above, and the unit Kuhn simplex. In this subsection, we discuss 
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important applications of these two special types of simplexes. 
To recapitulate, the unit standard simplex has vertices given by 

x =(0,0,0,..,0) , 
x x =(1,0,0,..., 0) , 
x 2 =(0,1,0,..., 0) , 
x 3 =(0,0,l,0,...,0) , 



x p _i =(0,0,0, ...,0,1,0) , 
x p =(0,0,0, ...,0,0,1) . 

(30) 

It is bounded by axis-parallel hyperplanes Xj = 0, j = l,—,p and the diagonal hyperplane 
1 = x± + X2 + ... + x p . Thus, the integral of a function f(x±, x p ) over the standard simplex can 
be written as a multiple integral in the form 



r rl rl—xi rl—X\- 

j f(xi,...,x p )dxi...dx p = / dx\ / dx2 

J standard simplex JO JO JO 

(■\-X\-Xl- ...-X p -2 rl — XI — X2 — ■■■ — Xp—\ 

x / dxp-i / dx p f(x 1 ,...,Xp) .(31) 

JO JO 



An important physics application of this formula is the Feynman-Schwinger formula for com- 
bining perturbation theory denominators, 

' W T7i z z „J . z n rznn^T - (32) 



D Di...D p ' ./standard simplex [(1 ~ X ± - X 2 ~ ■■■ ~ X p )D + + ... + XpD p \P +1 

which can be proved inductively as follows. For p = 1, the Feynman-Schwinger formula reads 

dxi T7~i TFT FT!? > 



D D 1 J '[(l-x^Do + x^ ' 

which is easily verified by carrying out the integral. Assume now that this formula holds for 
dimension p. For p + 1, the formula asserts that 

2 rl /•l — Xi — X2 — ... — x p -i rl — x\ — X2 — ---~x p -\ — x p 



D D 1 ...D p D p+1 



rl rl — Xi—X2 — ... — Xp-i r 

--{p + 1) ! / dx\.... I dx p I 

JO Jo Jo 



dx p+ i 

JO 

1 



[(1 -X1-X2- ••• - Xp - x p+ i)D + xiDi + ... + XpDp + x p+1 Dp +1 }P+ 2 

(34) 
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Carrying out the integral over x p+ i, we get 



1 



D D 1 ...D p D p+1 



P ■ 



dx-i 



1 — X1—X2- 



-Xp-1 



dx 



P Dp+i — Dq 



[(1 " Ef=i Xi)D + xiDi + ... + x p D p ]P+i [(1 - Ef=i Xi)D p+ i + siDi + ... + x p D p ]P+l 



(35) 



But applying the induction hypothesis for p dimensions, the right hand side of this equation reduces 
to 



1 



D 



p+i 



Do 



1 



D D 1 ...D P D P+1 



(36) 
is usually 



DoD^Dp D p+1 D l ....D p 
which is the result to be proved. In the literature, numerical evaluation of Eq. 
accomplished by first making changes of variable to convert the simplex integral to an integral 
over a hypercube, and then using a hypercube-based program such as VEGAS. Using the methods 
developed below for direct evaluation of integrals over a standard simplex in arbitrary dimensions, 
the formula of Eq. (132ft can also be integrated numerically in its original simplex form. 
We next turn to the unit Kuhn (1960) simplex, which has the vertices given by 

x =(0,0,0,...,0) , 
x x =(1,0,0,..., 0) , 
x 2 =(1,1,0,..., 0) , 
x 3 =(l,l,l,0,...,0) , 



x p -i =(1,1,1,..., 1,1,0) , 
x p =(1,1,1,..., 1,1,1) , 

(37) 

and which defines a simplex in which 1 > x\ > X2 > x^.... > x p _i > x p . The unit Kuhn simplex in 
two dimensions is illustrated in Fig. 2. 

The integral of a function f{x\, ■■■,x p ) over a unit Kuhn simplex can be written as a multiple 
integral in the form 

f(xi,...,Xp)dxi...dx p = / dx\ / dx2 / dx^.... 

unit Kuhn simplex JO JO JO 



x / dx p -i / dx p f(xi,...,Xp) 
Jo Jo 



(38) 
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FIG. 2: The unit Kuhn simplex in 2 dimensions. 



i 
i 




FIG. 3: Kuhn simplex tiling of a unit hypercube in 2 dimensions. 
Consider now the integral of the function f(x±, x p ) over the unit hypercube, 

/ dxi / dx 2 .... / dx p -i / dx p f(xi,...,x p ) . (39) 
Jo Jo Jo Jo 

This hypercube can be partitioned into p ! regions, each congruent to the unit Kuhn simplex, by 

the requirement that in the region corresponding to the permutation P of the coordinate labels 

l,...,p, the coordinates are ordered according to xp(i) > £p(2) > X P(3)---- > x P(p-i) > x P(p)- This 

partitioning or tiling is illustrated for a square in Fig. 3, and for a cube in three dimensions in Fig. 

1 of Plaza (2007). 

Hence the integral of / over the unit hypercube is equal to the integral of the symmetrized 
function computed from /, integrated over the unit Kuhn simplex, 



/ dxi / dx2---- / dxp-i / dx p f(xi,...,Xp 
Jo Jo Jo Jo 



L 



) 

f{xp {1) ,...,x P{p) )dxi...dx p . (40) 

unit Kuhn simplex p] pcrmutations p 

We will use this equivalence to construct adaptive programs for integration over a unit hypercube, 
based on first reducing it, by symmetrization, to an integral over a unit Kuhn simplex, and then 
adaptively subdividing the Kuhn simplex to reduce the integration error as needed. 
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V. SIMPLEX SUBDIVISION AND PROPERTIES 
A. Simplex subdivision algorithms 

Two very simple algorithms for subdividing simplexes have been given in the computer graphics 
literature by Moore (1992). Let us denote the vertices of the starting simplex by xq, x p , each of 
which is a p- vector, and from these let us form the p- vectors V(k\, k 2 ) defined by 

V(ki,k 2 ) = -(x kl +x k2 ) , ki,k 2 = 0,...,p. (41) 

Thus, V(0,0) = xo, ^(0,1) = (l/2)(xo + x\) and so forth, so that the vectors V(k\,k 2 ) consist 
of the original simplex vertices, together with the midpoints of the original simplex edges. Let 
k = 0, 2 P — 1 be an index which labels the 2 P subsimplexes into which the original simplex is 
divided. Moore then gives two algorithms, which he terms recursive subdivision and symmetric 
subdivision, for determining the vertices to be assigned to the subsimplex labelled with k. Both 
make use of the binary representation of k, and of a function determined by this representation, 
the bitcount function b(k), which is the number of 1 bits appearing in the binary representation of 
k. 

The recursive subdivision algorithm proceeds as follows. As the vertex of the subsimplex 
labelled by k, take the vector V (b(k) , b(k)) , that is, k\ = k 2 = b(k). To get the other vertices, scan 
along the binary representation of k from right (the units digit) to left. For each encountered, 
add 1 to k 2 , and for each 1 encountered, subtract 1 from k±. The sequence of vectors V(ki,k 2 ) 
obtained this way gives the desired p + 1 vertices of the kth subsimplex. 

The symmetric subdivision algorithm proceeds as follows. As the vertex of the subsimplex 
labelled by k, take the vector V(0, b(k)), that is, k\ = 0, k 2 = b(k). To get the other vertices, scan 
along the binary representation of k from right (the units digit) to left. For each encountered, 
add 1 to k 2 , and for each 1 encountered, add 1 to k\. The sequence of vectors V{k\, k 2 ) obtained 
this way gives the desired p + 1 vertices of the fcth subsimplex. 

The application of these algorithms in the p = 2 case is illustrated in Tables I and II and Figs. 
4-7, and in the p = 3 case is illustrated in Tables III and IV, where the notations V^\k\,k 2 ) 
and x^ both refer to the jth vertex, j = 0, of the subdivided simplex labelled by the k in 
each row. After reviewing these tables, it should be easy to follow the Fortran program for the 
algorithms given later on. The standard Fortran library does not include a bitcount function, but 
it does include a function I BITS (I, POS, LEN), which gives the value of the substring of bits of 
length LEN, starting at position POS, of the argument I. Thus, IBITS(k,j, 1) gives the binary 



17 



i 
i 




FIG. 4: Recursive subdivision of a standard simplex. 
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i 




FIG. 5: Symmetric subdivision of a standard simplex. 

digit (0 or 1) at position j in the binary representation of k, which is all the information needed 
for the algorithm. 

B. Subdivision properties 

This subdivision algorithm has a number of properties that will be useful in applying it to p 
dimensional integration. 

1. As noted by Moore, the subdivided simplexes all have equal volume, equal to the initial 
simplex volume divided by 2 P . This follows from the fact that the general formula for the 
volume of a simplex with vertices xq, x\, x p is 

V = — | det(xi — x ,x 2 — x , ...,x p — x )\ . (42) 

Applying this to the vertices for the subdivided simplexes in Tables I-IV verifies this state- 
ment for p = 2,3, while a proof in the general case is given in Edelsbrunner and Grayson 
(2000). 

2. Again as noted by Moore, both the recursive and symmetric algorithms subdivide Kuhn 
simplexes into Kuhn simplexes, which however do not all have the same orientation, as il- 
lustrated in Fig. 6 and Fig. 7. This follows from the fact that Kuhn simplexes are a tiling 
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TABLE I: Recursive subdivision of a triangle (p = 2) 








h 


b(k) V^\k u k 2 ) V^\k u k 2 ) 


V^(k U k 2 ) X W X™ x (2) 






0= 


=00 





(0,0) 


(0,1) 


(0,2) 


xo ^(x + xi) \{x + x 2 ) 






1= 


=01 


1 


(1,1) 


(0,1) 


(0,2) 


X\ \(x + Xi) \{x + x 2 ) 






2= 


=10 


1 


(1,1) 


(1,2) 


(0,2) 


Xl \{x-i + x 2 ) \{x + X 2 ) 








=11 


2 


(2,2) 


(1,2) 


(0,2) 


x 2 \{x\ + x 2 ) i(x + x 2 ) 












TABLE II: Symmetric subdivision 


of a triangle (p = 2) 






k 


b(k) V^(k u k 2 ) V^(k u k 2 ) W(k u k 2 ) 


3,(0) x (i) x (2) 




0= 


=00 







(0,0) 


(0,1) 


(0,2) 


Xq \{xq + Xi) \(xq + 


x 2 ) 


1: 


=01 


1 




(0,1) 


(1,1) 


(1,2) 


\(xq+x{) Xx ^{xi + 


x 2 ) 


2= 


=10 


1 




(0,1) 


(0,2) 


(1,2) 


^(xq + xx) \{x a +x 2 ) \{xi + 


x 2 ) 


3= 


=11 


2 




(0,2) 


(1,2) 


(2,2) 


\(x + x 2 ) \{xi + x 2 ) x 2 





of hypercubes, which are divided into hypercubes by axis parallel planes that intersect the 
midpoints of the hypercube edges. Adding additional diagonal slices intersecting the mid- 
points of the hypercube edges gives Kuhn tilings of both the original and the subdivided 
hypercubes. However, as also noted by Moore, when the algorithms are applied to general 
simplexes, the resultant subdivided simplexes can have different shapes, and are not isomor- 
phic. For p = 2, Fig. 4 shows that recursive subdivision applied to the standard simplex 
leads to subsimplexes of different shapes, while Fig. 5 shows that symmetric subdivision 
applied to the standard simplex leads to subsimplexes that are all standard simplexes with 
dimension reduced by half. However, an examination of the vertices in Table IV shows that 
already at p = 3, symmetric subdivision of a standard simplex does not lead to subsimplexes 
that are all half size standard simplexes. For example, for k = 2 in Table IV, there are 
vertices \{x$ + X2) and \{x\ + X3), the edge joining which has length V3/2, whereas the 
maximum side length of a half size p = 3 standard simplex is a/2/2. 




FIG. 6: Recursive subdivision of a Kuhn simplex. 
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TABLE III: Recursive subdivision of a tetrahedron (p = 3) 





k 


b(k) V(°>(fci,fc 2 ) 


V^(k 1 ,k 2 ) 


V^(k u k 2 ) V^(k u k 2 ) 


x (0) 




3.(2) 


3.(3) 


0= 


=000 


o 
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x 
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3(2:0 + 
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1= 
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CO 31 


Xi 
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xi) l(x + 


2:2) 3(2:0 + X 3 ) 


2= 


=010 


1 


(LI) 
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(0,3) 


Xi 
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X2) \{XQ + 
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3= 


=011 


2 


(2,2) 
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(0,3) 


X2 


${xi + 


^2) 5(^0 + 


X2) 5(2:0 + x 3 ) 


4= 


=100 


1 


(1,1) 
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(0,3) 


Xi 
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X2) \{xi + 


x 3 ) 3(2:0+2:3) 


5= 


=101 


2 


(2,2) 


(1,2) 


(1,3) 


(0,3) 


X2 


|(zi + 


X2) \{xi + 


x 3 ) 3(2:0 + x 3 ) 


6= 


=110 


2 


(2,2) 


(2,3) 


(1,3) 


(0,3) 


X2 


i(x 2 + 


X3) 5(2:1 + 
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7= 


=111 


3 


(3,3) 


(2,3) 


(1,3) 


(0,3) 


X 3 


3(2:3 + 


£3) 3(^1 + 


2:3) 3(2:0+2:3) 



TABLE IV: Symmetric subdivision of a tetrahedron (p = 3) 



k b{k)V {tt \k 1 ,k2)V ( - 1 \k 1 ,k 2 )V ( * 2 \k 1 ,k2)V { V{k 1 ,k 2 ) x« x< 2 ) x (3) 
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(0,0) 


(0,1) 


(0,2) 


(0,3) 
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=001 
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(1,2) 


(1,3) 


l(x +2;i) xi i(xi + x 2 ) \{x\ + x 3 ) 


2= 


=010 


1 


(0,1) 


(0,2) 


(1,2) 


(1,3) 


l(x +xi) l(x + x 2 ) \{xi+x 2 ) \{x\ +x 3 ) 


3= 


=011 
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(0,2) 


(1,2) 


(2,2) 


(2,3) 


l(x +x 2 ) l(xi +x 2 ) x 2 \{x 2 +x 3 ) 


4= 


=100 


1 


(0,1) 


(0,2) 


(0,3) 


(1,3) 


i(x +Xl) i(x +X 2 ) 2-(x +X 3 ) i(xi+X 3 ) 
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(0,2) 


(0,3) 


(1,3) 


(2,3) 


3(2:0 +X 2 ) 3(2:0+2:3) 3(2:1+2:3) \{X 2 +X 3 ) 


7= 


=111 


3 


(0,3) 


(1,3) 


(2,3) 


(3,3) 


l(x +x 3 ) l(xi +x 3 ) \{x 2 +x 3 ) x 3 



3. An important question is whether the maximum side length of the subdivided simplexes 
decreases at each stage of subdivision. For Kuhn simplexes, the answer is immediate, since 
subdivision results in Kuhn simplexes of half the dimension. Since the longest side of a unit 
Kuhn simplex in dimension p has length y/p, after I subdivisions the maximum side length 




FIG. 7: Symmetric subdivision of a Kuhn simplex. 
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will be 

^ x n = VP/2 £ , (43) 

irrespective of whether recursive or symmetric subdivision is used. For standard simplexes, 
we can get an upper bound on the maximum side length by noting that a unit standard 
simplex on axes yi,...,y p is obtained from a unit Kuhn simplex on axes xi,...,x p by the 
linear transformation y p = x p , y p -\ = x p _i - x p , y p - 2 = x p - 2 - sc p _i, ...,y 2 = x 2 - x 3 , y\ = 
x\ — x 2 , since this maps the components of the Kuhn simplex vertices given in Eq. (|37p 
to the corresponding components of the standard simplex vertices given in Eq. (|30p . By 
linearity, this relation also holds between vertices of corresponding subdivided simplexes 
obtained from the initial unit standard and Kuhn simplexes by applying the same midpoint 
subdivision method (either symmetric or recursive) successively to each. Consequently, the 
length £ standard f an edge with components Ef of a subdivided standard simplex can be 
expressed in terms of the components Ef of the corresponding edge of the related Kuhn 
simplex by 

standard ^(E^) = []T>f - E f +l f + {E*)*)\ 

j=i i=i 
v 

<2[^(£f) 2 ]l =2L Kuhn . 
3=1 

(44) 

Thus the length ^ standard i§ bounded from above by twice the maximum length corresponding 
to a subdivided Kuhn simplex, and so 

C" < VP/I 1 ' 1 ■ (45) 

We have verified this inequality numerically for both the recursive and symmetric subdivision 
algorithms. The numerical results suggest that the symmetric subdivision algorithm is in 
fact a factor of 2 better than the bound of Eq. (|45p. so that 

L standard; symmetric < ^ ^ 

but we do not have a proof of this. We already see evidence of this difference between the 
symmetric and recursive algorithms in Tables III and IV. As noted above, from Table IV we 
saw that symmetric subdivision of a p = 3 standard simplex gives an edge of length v3/2, 
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and it is easy to see that this is the longest edge. However, from Table III we see that for 



4. The result of Eqs. (j4"5j) and (ji"6l) suggests the stronger conjecture, that after any number £ of 
symmetric (recursive) subdivisions of a standard simplex, the resulting subsimplexes each fit 
within a hypercube of side 1/2^ (1/2^ _1 ). A simple argument shows this to be true for £ = 1 
in any dimension p. Although we do not have a proof for general £, we will use this conjecture 
in certain of the algorithms constructed below. For Kuhn simplexes, an analogous statement 
with a hypercube of side 1/2^ is true for both symmetric and recursive subdivision, as noted 
above in the discussion preceding Eq. ((151) . 

5. Finally, we note that although the symmetric algorithm gives the same simplex subdivision 
after permutation of the starting vertices in dimension p = 2, as can be verified from Table 
II, it is not permutation symmetric in dimension p = 3, as can be verified from Table IV. For 
example, interchanging the labels and 1 in the k = 2 line of Table IV gives a set of vertices 
that is not in the table. This means that with symmetric (as well as recursive) subdivision in 
dimension p > 3, inequivalent simplex subdivisions can be generated by permuting the labels 
of the starting vertices. However, we have not incorporated this feature into our programs. 

The properties just listed show that the symmetric and recursive subdivision algorithms are 
well suited to adaptive higher dimensional integration. They are easily computable in terms of 
the vertex coordinates for a general simplex, and give subsimplexes of equal volume, so that it is 
not necessary to calculate a determinant to obtain the volume. Additionally, the bound on the 
maximum side length decreases as a constant times 1/2^ with increasing order of subdivision £, so 
that the application of high order integration formulas gives errors that decrease rapidly with £. 



We have discussed simplexes first, because as noted in Sec. Ill, our direct approach to hypercube 
integration will be based on following as closely as possible the methods that we develop for simplex 
integration. In our direct hypercube programs (i.e., the ones not based on tiling a side 1 hypercube 
with Kuhn simplexes), we will start from a half-side 1 hypercube with base region 



k = 5 there are vertices X2 and \ 
simplex, has length y/E/2. 



(x\ + X3), the edge joining which, for an initial standard 



VI. HYPERCUBE SUBDIVISION AND PROPERTIES 




(47) 
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This region has inversion symmetry around the origin, and consequently the only monomials that 
have non-vanishing integrals over this region are ones in which each coordinate appears with an even 
exponent, considerably simplifying the calculations needed to construct higher order integration 
rules. 

Since we restrict ourselves to axis-parallel hypercubes, only p + 1 real numbers are needed to 
uniquely specify a hypercube: the p coordinates of the centroid x c and the half-side length S. For 
example, for the region of Eq. (I47p . the centroid is x c = (0,0, ...,0) and the half-side is 1. Once 
we have adopted this labelling, we can give a very simple subdivision algorithm for hypercubes, 
constructed in direct analogy with Moore's simplex subdivision algorithms. 

The hypercube subdivision algorithm proceeds as follows. Start from a hypercube with centroid 
x c and half-side S, with sides parallel to the p unit axis vectors 

u x =(1,0,0,...,0) 
n 2 =(0,l,0,...,0) 



Vi =(0,0,. .,1,0) 
u p =(0,0,. .,0,1) . 

(48) 

To subdivide it into 2 P subhypercubes, take the new half-side as 5/2. To get the new centroids 
x c -k, labelled by k = 0, ...,2 P — 1, scan along the binary representation of k from right (the units 
digit) to left. Denoting the p digits in this representation by 1 < j < p, let us label the units 
digit as j = 1, the power of 2 digit as j = 2, the power of 4 digit as j = 3, and so forth. For all 
1 < j < P, if the j th digit is 0, add \Suj to x c , and if the jth digit is 1, add — \Su,j to x c . For 
each given k, this gives the centroid of the kth subhypercube. This algorithm is illustrated for the 
case of a cube (p = 3) in Table V. 

This algorithm is simpler than the ones for subdividing simplexes, since it only needs the Fortran 
IBITS function, but does not require subsequent computation of the bitcount function. It evidently 
has properties analogous to those of the simplex subdivision algorithms: each subhypercube has the 
same volume, equal to the original hypercube volume divided by 2 P , and every linear dimension of 
each subhypercube is a factor of 2 smaller than the corresponding linear dimension of the hypercube 
that preceded it in the subdivision chain. This latter implies that after £ subdivisions, the resulting 
subhypercubes all have dimension reduced by a factor 1/2 . 
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TABLE V: Subdivision of a cube of half-side 5 and centroid x c (p = 3) 

0=000 (5/2,5/2,5/2) 

1=001 (-5/2,5/2,5/2) 

2=010 (5/2,-5/2,5/2) 

3=011 (-5/2,-5/2,5/2) 

4=100 (5/2,5/2,-5/2) 

5=101 (-5/2,5/2,-5/2) 

6=110 (5/2,-5/2,-5/2) 
7=111 (-5/2,-5/2,-5/2) 

For a hypercube with centroid x c and half-side S, and for a general point x, let us define the 
coordinate relative to the centroid as x, = x — x c , as we did in the simplex case in Eq. (|15p . Consider 
now the set of 2p points Xj , j = 1, 2p defined by 

xt =(5,0,0,..,0) 
x 2 =(0,5,0,..,0) 



x p =(0,0,...,5) 
Xp+i =(-5,0,0, ...,0) 
ip+2 =(0, -5,0, ...,0) 



x 2p =(0,0, ...,-S) 

(49) 

These points are the centroids of the maximal boundary hypercubes, and will play a role in the 
direct hypercube algorithm analogous to that played by the simplex vertices in the simplex adaptive 
algorithm. For future use, we need the following result, analogous to that of Eqs. ([28]) and (f29|) 
for the simplex case. Consider the sum 

N 

X = J2kZi , (50) 
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with the coefficients Aj obeying 



Xi > 0, i = l,...,N 



N 



(51) 



with the points Xi any of the hypercube boundary points of Eq. (|49H . Some of these points may 
be omitted (in which case the corresponding coefficient Aj is 0), and some used more than once, 
in the sum of Eq. (|50p . Then the point X lies inside the hypercube. To see this, we note that the 
projection of X along any axis j is of the form Xj = S(/j,+ — //_), with fj,± each a sum of some 
subset of the coefficients Aj, and hence < n± < 1. Therefore —S< —Sfi- < Xj < < S for 
each axis component Xj, and thus X lies within the hypercube. This proof, again, is simpler than 
the corresponding result in the simplex case. 

VII. PARAMETERIZED HIGHER ORDER INTEGRATION FORMULAS FOR A 

GENERAL SIMPLEX 

We turn next to deriving higher order integration formulas for a general simplex, which are 
expressed directly in terms of the set of simplex vertices, and which involve parameters that can 
be changed to sample the function over the simplex in different ways. Two different choices of the 
parameters then give two different integration rules of the same order, which can be compared to 
give a local error estimate for use in adaptive integration. 

Since we want to derive integration rules up to ninth order in accuracy, we start from an 
expansion of a general function f(x) up to ninth order, with x, as before the p dimensional coordinate 
referred to the simplex centroid as origin. The expansion reads, 



We next need expressions for the integral of the monomials appearing in the expansion of Eq. (|52p 
over a general simplex with vertices Xq, x p . A general formula for these integrals has been given 
by Good and Gaskins (1969, 1971). They define m{y) as the generalized moment 





(52) 




simplex 




(53) 
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and show that m(u) is equal to the coefficient of t" 1 ...tp p in the expansion of 

Here W s is a double sum over ith components of the p + 1 simplex vertices labelled by a = 0, 
given by 



(55) 



a=0 i=l 

and V is the simplex volume. Good and Gaskins derive this formula by first transforming the 
original simplex to a standard simplex, followed by lengthy algebraic manipulations to express the 
resulting formula symmetrically in terms of standard simplex vertices. We give in Sec. VIII below 
a derivation that proceeds directly, and with manifest symmetry, from the vertices of the original 
simplex. 

To proceed to 9th order we need an expansion of the exponential in Eq. (154)) through 9th order. 
Since each W s is of degree s in the coordinates, the terms in this expansion are as follows: 



second order 
third order 
fourth order 
fifth order 
sixth order 
seventh order 
eighth order 
ninth order 



W2 

2 

Ws 
3 

IT + X 

w 2 w 3 w b 

6 5 
W$ W$ W 2 Wi W 6 
~48 + ~18 + 8 + ~6~ 
WjW 3 W 3 W 4 W 2 W 5 W 7 

24 + 12 + 10 + ~T 



_ 3 wjw A wj w 3 w 5 w 2 w 6 w 8 

36 32 32 15 12 8 



W 2 W? 



W$ 
384 

WfWs , W$ 
144 



W 2 W 3 W 4 WiW 5 W A W b W 3 W 6 W 2 W 7 
162 + 24 + 40 + 20 + 18 + 14 



W9 
9 



(56) 



We are interested in integrals of monomials of the form x^.-.Xi , with n ranging from 1 to 9. Good 
and Gaskins note that it suffices to consider the case in which all the indices ii,...,i n are distinct 
(which is always possible for p > n), since the combinatoric factors are such that this gives a result 
that also applies to the case when some of the component indices are equal, as must necessarily be 
the case when p < n. So we can take v \ = 1 , i = 1, n, and J2i v i = n i with n the order of the 
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monomial. We now can infer from Eq. (|56p the moment integrals 

If _ p\ 

— I dx\...dXpXi 1 ...Xi — —. ; r-:S n , (57) 

V Vsimplcx VP + n) \ 

with the quantities S n (with tensor indices suppressed) given in terms of tensors S^...^ defined by 
sums over the vertices, 

p 

Sj^.An = ^ ~]%jil---Zjin J (58) 

as follows: 

^3 =25*^42,3 

^4 — •5^112*^314 "I" ^ilis^i^iA *5ili4'-'i2*3 6*^41*21344 — ^4142 '-'4344 "I - 2 terms + 6<Sj 1 j 2 i3i4 

5s — 2(iS , j 1 j2*S , j 3 j 4 j 5 + 9 terms) + 24S'j 1 j 2 j 3 j 4 j 5 

<$6 =5 , j 1 j 2 5 , i 3 i 4 5j 5i6 + 14 terms + 4:(S ili2i:i Si 4 i s i 6 + 9 terms) + 6{S ili2 Si 3 i 4 i B i 6 + 14 terms) 
+1205"^ 4213241516 

<S 7 =2(5 il i 2 5'i 3 i 4 5 i5 i 6 j 7 + 104 terms) + l2(Si li2 i 3 Si 4 i 5i( .i 7 + 34 terms) 

+24(5"^ j 2 Si 3 i^i^iQi-j ~\~ 20 terms) + 7205 , j 1 j2j 3 i 4 j5igi7 
5s =Si 1 i 2 Si 3 i 4 S{ 5 i 6 Si 7 i 8 + 104 terms + 4(S , j 1) ; 2 5'j 3 j 4 j 5 5'i 6 i 7 i 8 + 279 terms) 
+6(5 , j 1 j 2 »S'i 3 i 4 'S'j 5 j 6 j 7 j g + 209 terms) 

421314 Si B i 6 i 7 i s + 34 terms) + / i8(Si 1 i 2 i 3 Si 4 i 5 i 6 i 7 i 8 + 55 terms) 
+120(5*^42 ^i^i 4,151^17 is, 27 terms) + 50405^ 12434445464748 

+ 1259 terms) + SiS^i^S^igSi 

7«849 + 279 terms) 

-\-^2{Si 1 i 2 Si 3 i 4 i 5 Si 6 i 7 i s ig + 1259terms) + 24(Si 1 i 2 Si 3 i 4 Si 5 i 6 i 7 i s i 9 + 377 terms) 
+144(5^2 

4344 , S'454647*849 + 125 terms) + 240( 1 S , , 1 i 2 i 3 S r j 4 i s i 6 i 7 i g i 9 + 83 terms) 
+720(5"^ j 2 5'j 3 j 4 j 5 jgj 7 j g jg +35 terms) + 403205'j 1 1243444546*74849 

(59) 



The rule for forming terms in Eq. (J59J) from those in Eq. (J56J) is this: for each W s in Eq. (|56p 
there is a tensor factor 5 with s indices, and the product of such factors appears repeated in all 
nontrivial index permutations, giving the "terms" not shown explicitly in Eq. (159p . The numerical 
coefficient is constructed from the denominator appearing in Eq. (|56p . multiplied by a numerator 
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consisting of a factor s! for each W s , and for each W™ an additional factor ml (that is, for Wj 71 
there is altogether a factor (s !) m m !). For example, a Wf in Eq. (|56|) gives rise to a numerator 
factor of (2 !) 3 3 ! = 48 in Eq. ()59[) . and a W2W3W4 in Eq. (|56p gives rise to a numerator factor of 
2 !3 !4 ! = 288 in Eq. (|59p . In each case, the product of this numerator factor, times the number 
of terms in the symmetrized expansion, is equal to n\. For example, 48 x 15 = 720 = 6!, and 
288 x 1260 = 362880 = 9!. 

Our next step is to combine Eqs. ([52]) . ([57]) . and ([59]) to get a formula for the integral of the 
function / over a general simplex, expressed in terms of its expansion coefficients. Since we will 
always be dealing with symmetrized tensors, it is useful at this point to condense the notation, 
by labelling the contractions of the expansion coefficients with the tensors S by the partition of n 
which appears. Thus, we will write 

Ci 1 i 2 Si 1 i 2 = C2 

-^iii2i3i4i5 ${31415 9 terms) — -^3+2 
Hi 1 i2igi4igigij\Sii 1 i2Si s i 4 Si 5 igiY + 104 terms) — H3+2+2 ; 

(60) 

and so forth. Since the partitions of n that are relevant only involve n > 2, a complete list of 
partitions that appear through ninth order is as follows: 

C 2 

D 3 

E 4, 2 + 2 

F 5, 3 + 2 

G 6, 4 + 2, 2 + 2 + 2, 

H 7, 5 + 2, 3 + 2 + 2, 

/ 8, 6 + 2, 4 + 2 + 2, 

J 9, 7 + 2, 5 + 2 + 2, 

(61) 



3 + 3 

4 + 3 

2 + 2 + 2 + 2, 5 + 3, 3 + 3 + 2, 4 + 4 

3 + 2 + 2 + 2, 4 + 3 + 2, 6 + 3, 5 + 4, 3 + 3 + 3 



Employing this condensed notation, we now get the following master formula for the integral of / 
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over a general simplex, 



If pi p\ 

— / dx 1 ...dx p f(x) = A + - — -^-7^2 + 7 — —ttt2D 3 

^./simplex (p + 2)! (p + 3)! 



+ (p ^ ! 6) , (120G 6 + 6G 4+2 + 4G 3+3 + G 2+2+2 ) 

+ (p ^ ! 7) , (720ff 7 + 24tf 5+2 + 12^4+3 + 2F 3+2+2 ) 
p! 

+ ^^ 8) , (5040/ 8 + 120/ 6+2 + 48/ 5 +3 + 36/ 4+4 + 6/ 4+2+2 + 4/ 3 +3+2 + / 2+2+2+2 ) 

+ . Pl n . , (40320 J 9 + 720J 7+2 + 240J 6+3 + 144J 5+4 + 24J 5+2+2 + 12J 4+3 + 2 
(p + 9) ! 

+8J3+3+3 + 2J3+ 2 + 2 + 2 ) 

+... . 

(62) 

Our procedure is now to match this expansion to discrete sums over the function / evaluated at 
points on the boundary or interior of the simplex. We will construct these sums using parameter 
multiples of the vertices of the simplex (in which the summation limits for a, b, c, d are to p for 
simplexes, and will be 1 to 2p later on when we apply these formulas to hypercubes), 

Ei(A)=£/(Ax ) , 0<A<1 

a 

X 2 (A, cr) = f(^ x a + crxb) , < A, cr , A + cr < 1 

a,b 

S 3 (A, cr, n) = ^ f(Xx a + axi, + fix c ) , < A, cr, /i , A + cr + ,[/ < 1 

a,6,c 

£ 4 (A, 0", /X, k) = f(^ x a + C^b + fJ-X c + KXrf) , < A, <7, /i, K , A + (T + ^ + K<l , 

(63) 

where the conditions on the parameters A, a, fi, k guarantee, by our discussion of simplex properties, 
that the points summed over do not lie outside the simplex. Clearly, once we have a formula for 
£4, we can get a formula for S3 by setting k = and dividing by p + 1 (which becomes 2p in the 
hypercube case); we can then get S 2 by further setting jjl = and dividing out another factor of 
p + 1, and so forth. Hence we only exhibit here the expansion of S 4 in terms of /(0) = A and the 
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+£ 2 (AV + cr 2 A 3 + AV + ^A 3 + AV + k 2 A 3 + a 2 fi 3 + ^ct 3 + ctV + k 2 ct 3 + fi 2 n 3 + «V)*< 



contractions C 2 , J3+2+2+2 appearing in Eq. ([62]) , Abbreviating £ = p + 1, we have 

S 4 = C 4 ^4 + £ 3 (A 2 + a 2 + ^ 2 + k 2 )C 2 + e 3 (A 3 + ct 3 + ^ + /t 3 )Z?3 + £ 3 (A 4 + ct 4 + // + k 4 )£ 4 
+2£ 2 (A 2 ct 2 + AV + AV + ct 2 // 2 + ctV + /iV)£fe +2 + £ 3 (A 5 + ct 5 + /i 5 + k 5 )F 5 

2 :'■ , ..2..3\tt 

3+2 

+^3 (A 6 + a 6 + ^ + K 6 )Gg + ^2(^2 + A 2 CT 4 + ^2 + ^4 + ^2 + ^4 + ^2 + ^4 

+ctV + ctV + fi 4 n 2 + /A 4 )G 4+2 + 6C(A 2 a 2 ^ 2 + A 2 ctV + A 2 /jV + CT 2 /i 2 k 2 )G 2+2+2 
+2£ 2 (A 3 ct 3 + A 3 // + A 3 /-t 3 + ctV 3 + ct 3 k 3 + fi 3 K 3 )G 3+3 + £ 3 (A 7 + ct 7 + // + k 7 )# 7 

^C 2 / \5„2 j_ \2„5 , \5,,2 i \2,,5 , \5„2 , \2 5 , 5 2 , 2 5 i _5 2 , _2 5 • 5 „2 ■ ,,2 „5\ rr 
+4 (Act +Act + A /Li + A /i + Ak + A k + a fi + ct /x + ct k + ct k + fi k + fi k )-H5+2 

+2^(A 3 ct 2 / u 2 + A 3 ctV + A 3 /j 2 k 2 + ct 3 A 2 / u 2 + ct 3 AV + ctV 2 k 2 + fi 3 X 2 a 2 + //AV + /x 3 ctV 

+k 3 A 2 ct 2 + K 3 A 2 /i 2 + K 3 CT 2 /i 2 )F 3+2+2 + ^ 2 (A 4 ct 3 + A 3 ct 4 + X 4 fi 3 + A 3 /. 4 + A 4 k 3 + A 3 k 4 + ctV 3 + CT 3 /i 4 

+ct 4 k 3 + ctV + a 3 + /uV)# 4+ 3 + c 3 (a 8 + ct 8 + / + k 8 )/ 8 + e 2 (A 6 CT 2 + a 2 ct 6 + Ay + Ay 

+AV + AV + ctV 2 + v 2 ^ + <t 6 k 2 + ctV + M V + fi 2 K 6 )I 6+2 + 24(A 4 ct 2 /x 2 + A 4 ctV + X 4 fi 2 K 2 

i 4\2 2 i 4\2 2 i 4 2 2, 4\2 2 , 4\2 2 , 4 2 2, 4\2 2 , 4\2 2 , 4 2 2\ j 
+CT A /i + CT A K + CT /i K + fi X a + fi X K + fi O K + K A CT + K X fi + K CT /X )/ 4+ 2+2 

+24A 2 ct 2 /i 2 k 2 / 2+2+2+2 + e 2 (A 5 CT 3 + A 3 ct 5 + AV 3 + A 3 /i 5 + A 5 k 3 + A 3 k 5 + ct 5 /x 3 + + 

+ CT 3 K 5 + fi 5 K 3 + fi 3 K 5 )I 5+3 + 2£(A 2 CTy + A 2 CT 3 K 3 + X 2 fi 3 K 3 + CT 2 A 3 /U 3 + CT 2 A 3 K 3 + (J 2 fi 3 K 3 

+ft 2 X 3 a 3 + fi 2 X 3 K 3 + /xW + K 2 X 3 a 3 + K 2 X 3 ft 3 + K 2 a 3 fi 3 )h +2+2 + 2£ 2 (A 4 ct 4 + A 4 // + AV + ct 4 // 4 
+ctV + /i 4 K 4 )/ 4+4 + e 3 (A 9 + a 9 + fi 9 + k 9 )J 9 + e 2 (A 7 CT 2 + AV + AV 2 + A 2 /i 7 + AV + A 2 k 7 

,72, 27, 72, 27, 72, 2 7\t , o<V\5 2 2 , \ 5 2 2 , \ 5 2 2 , 5 \2 2 

+a fi + a fi + a k + a k + fi K + fi k JJ7+2 + 2^(A ct /i + Act k + A /j k + ct A fi 

5\2 2 1 5 2 2 1 5\2 2 , 5\2 2 , 5 2 2, 5 \2 2 , 5\2 2 , 5 2 2\ t 

+ct A k + a ft k + fi X ct + fi A « + ft a k +k X ct +« A /j + k ct /i JJ5+2+2 

, c /\3 222,\2 322,\2 232|\2 22 3\T , /-/\4 3 2 , \4 2 3 , \4 3 2 

+ 6(A CT fi K + A CT /i K + A CT /U K + A CT fi K J ^3+2+2+2 + ?(A CT /i + X <J fi + A CT K 

+A 4 ct 2 k 3 + X 4 fi 3 K 2 + X 4 K 2 fi 3 + a 4 X 3 fi 2 + a 4 X 2 fi 3 + ct 4 X 3 k 2 + ct 4 A V + a 4 fi 3 K 2 + a 4 fi 2 K 3 
+fi 4 X 3 a 2 + fi 4 X 2 a 3 + fi 4 X 3 K 2 + fi 4 X 2 K 3 + fi 4 a 3 K 2 + fi 4 a 2 K 3 + k 4 A 3 ct 2 + k 4 A 2 ct 3 + ^A 3 ^ 2 
+K 4 X 2 fi 3 + K 4 a 3 fi 2 + n 4 a 2 fi 3 ) J 4+3 +2 + £ 2 (A 6 ct 3 + A 3 ct 6 + A 6 /i 3 + A 3 ^ 6 + A 6 k 3 + A 3 k 6 + CT 6 /i 3 + 

, ^6^3 , ^.3^6 , ,,6^3 , ,,3..6\ t , c 2^5 4 , ^4^.5 , \5,,4 , a4,,5 , \5 4 , \4^5 , „5 , ,4 , 4, ,5 
+ CT K +CTK + fi K + fi K ) Je+3 +4 (ACT +ACT + A/i +A/i + AK + AK +CT/J, +CT/i 

+ CT 5 K 4 + CT 4 K 5 + /iV + /k 5 ) J 5 +4 + 6£(A 3 CTy + A 3 CT 3 K 3 + AVjU 3 + (J 3 fl 3 K 3 ) J3+3+3 • 



(64) 

Evidently, for simplexes Eq. ([63]) requires (p + l) 4 function evaluations to compute S 4 , (p+ l) 3 
function evaluations to compute S3, etc. For hypercubes, with the simplex vertices replaced by the 



30 



2p points of Eq. (09]), and £ = 2p, Eq. ([63]) requires (2p) 4 function evaluations to compute £4, (2p) 3 
function evaluations to compute S3, etc. Since it is known that the minimal number of function 
evaluations for a simplex integration method of order 2t + 1 involves p t /t ! + 0(p t ~ 1 ) function calls 
(Stroud (1971), Grundmann and Moller (1978)), and for a hypercube integration method of order 
2t + 1 involves (2p) t /t ! + 0(p t ~ 1 ) function calls (Lyness, 1965), we will take the leading Es in our 
integration formulas to have equal arguments, e.g. £4(A, A, A, A), Ss(A, A, A, A), etc. This allows 
the parameterized integration formulas constructed below to have an optimal leading order power 
dependence on the space dimension p (but reflecting the parameter freedom, the non-leading power 
terms will not in general be minimal). In the computer program, the following formulas are useful 
in evaluating the sums using a minimum number of function calls, 

S 4 (A,A,A,A) =24 f(Kxa + x b + 5; c + x d )) + 12^2 Y f(.2\i a + X(i b + x c )) 

a<b<c<d a b=£a,c^a,b<c 

+ 6 E E ^ 2X ^ + + 4 E E f( 3Xia + + E / ( 4Ai ») ' 

a b<a a b^a a 

S 3 (A, A, A) =6 Y f(M*a + x b + x c )) + 3 Y E ^ 2A ^ + A£fe ) + E /( 3A ^) . 

a<b<c a b=£a a 

S 2 (A, A) =2 Y E ^ A (^ + ^)) + E ^ 2X ^ > 
S 3 (2A,A,A) =2^ /(2Ax a + A(x b + 5 c )) + 2^^/(3Ax a + A5 fe ) 

a b^a, c^a, fe<c a bj^a 

+ 2 EE^( 2A (^ + ^)) + E^( 4A ^) > 

S 2 (3A, A) = J] /( 3A ^ + A ^) + E ^ 4A ^) ' 
S 2 (2A, A) = Y ^ 2A ^ + + E /( 3A **) ' 

(65) 

With these preliminaries in hand, we are now ready to set up integration formulas of first 
through fourth, fifth, seventh, and ninth order, for integrals over general simplexes. 

A. First through third order formulas 

We begin here with integration formulas of first through third order, which may be more useful 
than high order formulas for integrating functions that are highly irregular, or as explained later 
on, for integrations in low dimensional spaces. Two different first order accurate estimates of the 
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integral of Eq. (|62|) are clearly 

I a =£i(A)/£ = A + second order , 

i b =f(b) = A , 

(66) 

with x = the simplex centroid. Evidently 1^ is the dimension p analog of the dimension one 
center-of-bin rule, and when A = 1, I a is the dimension p analog of the dimension one trapezoidal 
rule. 

To get a second order accurate formula, we have to match the terms 

in Eq. ([62]). Solving Ei(A) = £A + \ 2 C 2 + ... for C 2 , we get 

C 2 ~ [£i(A)-64]/A 2 , (68) 
which when substituted into Eq. (j67|) gives the second order accurate formula 

/ =( 1 -(^^) / < 8 » + (?W^< A) • < 69 » 

Using two different parameter values X at b gives two different second order accurate estimates I a ^ 
of the integral. 

We give two different methods of getting a third order accurate formula, both of which will play 
a role in the methods for getting higher odd order formulas. We first note that for A = 2/(p + 3), 
we have 

S 1 (A)=^ + A 2 [C 2 + 2 J D 3 /(p + 3)] , (70) 

and so the coefficients of C 2 and D% are in the same ratio as appears in Eq. (|62[) , Hence defining 
an overall multiplicative factor k\ to make both terms match in magnitude, and adding a multiple 
kq of A to make this term match, we get a third order accurate formula 

I a =K^l (A) + K /(0) , 

pi 2 Op + 3) 2 

K l = 7 rr A — 



(p + 2)! 4(p + 2)(p + l) 

K =1 - C«l • 

(71) 
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An alternative method of getting a third order accurate formula is to look for a match by writing 

2 

/6=«o/(6)+2>i E i( A i) 



i=i 

2 



i=l 

(72) 

Equating coefficients of A we get 

2 

«o = l-^4 , (73) 

i=l 

while equating coefficients of C2 and D3, we obtain a system of two simultaneous equations for 

(74) 

where we have abbreviated 



qi (p + 2)l ' 92 (p + 3)! 
^= K i(Ai) 2 , i = 1,2 . 



This set of equations can be immediately solved to give 

A? qi - <?2 

w 2 — 



(75) 



(76) 



giving a second third order accurate formula for any nondegenerate A} and Af lying in the interval 
(0,1). We will see later on, in discussing higher odd order integration formulas, that this is our 
first encounter with a Vandermonde system of equations. 
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B. Fourth order formula 



Although we will subsequently focus on odd-order formulas, we next derive a fourth order 
formula, which follows a different pattern. Referring to Eq. (162p , we see that to get a fourth order 
formula we have to use the sums of Eq. (|63[) to match the coefficients of A, C 2 , -D3, E4, and -^2+2- 
Since only the final one of these, £"2+2 > involves two partitions of 4, we can use E2(A, A) to extract 
this, with any positive value of A < \. Since the simplex subdivision algorithm uses the midpoints 
\{x a + Xb) as the vertices of the subdivided simplex, an efficient way to proceed in this case is 
to take A = \ in £2 (A, A), so that what is needed is the function value at the midpoints, and to 
compute these function values as part of the subdivision algorithm. This also yields the function 
values at the vertices of the subdivided simplex. We can get A from f(x c ), and we can fit C 2 , 
D3, and E4 by evaluating Si(A) with three distinct values of A. One of these values can be taken 
as A = 1, corresponding to the function values at the simplex vertices. The other two are free 
parameters, and by making two different choices for one of these, we get two different fourth order 
evaluations of the integral. 

We worked out the fourth order program before proceeding systematically to the odd order 
cases, and so used a different notation from that of Eq. (|63j) . Let us write f c , f v , and f s for the 
sums of function values at the centroid, the vertices, and the side midpoints, 



fc =f(Xc) 



1 



p + i 



a 




(77) 



Let us also introduce, for n > m, the definition 



Pnm — 



(p + m){p + m + l)...(p + n) 



(78) 



A simple calculation then shows that through fourth order terms, we have 




k A + k 2 C 2 + k 3 D 3 + k A E A 



(79) 
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with coefficients given by 



k 


=1 - 


8(p + 1) 




P42 


k 2 


1 


4 








P21 


P42 


k 3 


2 


2 








P31 






6 


1 








P41 


P42 



(80) 

Defining now 

/a = -^E/( A ^) > ( 81 ) 

a 

so that /i = /„, we find that through fourth order, 

Ai — A 2 

*3 = f + x [/d - /c - T2 (Aj ~ /c)] > 3 = 1 , 2 , 
1 - Aj X- 

1 2 

i=i 

C2=(p+l)(A,-/c)--D 3 -S 4 • 

(82) 

When substituted into Eq. (|79|) . this gives a fourth order formula for the integral, with a second 
evaluation of the integral obtained by replacing A2 by a third, distinct value A3. 



C. Fifth order formula 



We turn next to deriving a fifth order formula. Referring to Eq. (|62p , we see that to get a fifth 
order formula we have to use the sums of Eq. (|63|) to match the coefficients of A, C2, -D3, E4, 
E2+2, F5, and ^3+2- Since at most two partitions appear, we can still get the leading two-partition 
terms from £2 (A, A), but we must now impose a condition on A to guarantee that £"2+2 and -F3+2 
appear with coefficients in the correct ratio. From Eq. ([620 we see that the ratio of the coefficient 
of -F3+2 to that of -E2+2 must be 2/ (p + 5), and from Eqs. (J63"|) and (|64|) with A = a and fj, = k = 0, 
we see that this is obtained with 
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which for any p > 1 obeys the condition 2 A < 1. The overall coefficient of £ 2 needed to fit E2+2 
and -F3+2 is easily seen to be 

k ~ Pl A~4- (P + 5 ) 4 (M) 

K2 -2l(^T4)T A ~-^T ' (84) 



where we have used the abbreviated notation of Eq. (|78[) . Thus we have, again from Eq. (|64f) . 

rc 2 E 2 (A,A) = p! £ 2+2 + 2 Pl F 3+2 + K 2 [g 2 ^ + g(2A 2 C 2 + 2A 3 D3 + 2A 4 ^4 + 2A 5 F 5 )] , (85) 
(p + 4) ! (p + 5) ! 

with £ = p + 1. Since there are four single partition terms, we look for an integration formula of 
the form 

4 

« 2 E 2 (A ! A) + ^4Ei(Ai) + K ^ , (86) 

i=l 

which is to be equated to the sum of terms through fifth order in Eq. (|62p . 

The equation for matching the coefficient of A can immediately be solved in terms of the 
coefficients n\, giving 

4 

k = 1-R , # = ^2 + £j>i • (87) 

i=i 

The four equations for matching the coefficients of C 2 , D3, E4, and F5 give aJV = 4 Vandermonde 
system that determines the four coefficients k\. Writing an order N Vandermonde system in the 
standard form 

N 

Y j x k l - 1 w i = q k , k = l,...,N , (88) 
i=i 

the equations determining the n\ take this form with 

Xi =X\ , Wi = k\(X\) 2 , 

1 2 

qi = 2£k 2 A , 

P21 

2 Q 

q 2 = 2£k 2 A j , 

P31 

6 

93 = 2£k 2 A 4 , 

P41 

24 K 

94 = 2^k 2 A 5 . 

P51 



(89) 

Solving this system of linear equations, for any nondegenerate values of the parameters < X\ < 1, 
gives the coefficients k\ and completes specification of the integration formula. 
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D. Vandermonde solvers 

Since we will repeatedly encounter Vandermonde equations in setting up parameterized higher 
order integration formulas, both for simplexes and for hypercubes, we digress at this point to 
discuss methods of solving a Vandermonde system. The explicit inversion of the Vandermonde 
system is well known (see, e.g. Neagoe (1996), Heinen and Niederjohn (1997)), and takes the form 

Qn - Si(x 2 , ...,x N )q N -i + S 2 (x 2 , ...,x N )q N - 2 - ■■■ + {-l) N ~ 1 x 2 ....x N qi 
Wl = - ( w r - ( v , (90) 

[Xl - X 2 ){Xl - X3)....(Xl - XN) 

with Sj(x 2 , ...,xn) the sum of j-fold products of x 2 , ...,xn, 
S 1 (x 2 , ...,x N ) =x 2 + ... + x N , 

S 2 (x 2 , ...,Xn) =X 2 Xs + ... + X 2 XN + X3X4 + ... + X3XN + ■■■ + XN-lXN , 

(91) 

and so forth. The remaining unknowns w 2 through wn are obtained from this formula by cyclic 
permutation of the indices i = 1,...,N on the Wi and the Xj, with the qk held fixed. For N 
not too large it is straightforward to program this solution, and we include subroutines for the 
N = 2, 3, 4, 6, 8 cases in the programs. This suffices to solve the Vandermonde equations appearing 
in the fifth through ninth order simplex formulas, and in the fifth through ninth order hypercube 
formulas derived below. 

For large N, programming the explicit solution becomes inefficient and a better procedure is to 
use a compact algorithm for solving the Vandermonde equations for general N, based on polynomial 
operations, which has running time proportional to N 2 . A good method of this type, that we have 
tested, is the algorithm vander.for given in the book Numerical Recipes in Fortran by Press et al. 
(1992). A similar algorithm for inverting the Vandermonde matrix (that we have not tested) can 
be found in an on-line paper of Dejnakarintra and Banjerdpongchai, searchable under the title "An 
Algorithm for Computing the Analytical Inverse of the Vandermonde Matrix" . 

E. Seventh order formula 

To get a seventh order formula, we use the sums of Eq. ([63]) to match the coefficients appearing 
in Eq. (162|) through the term H^ +2+2 . Since at most three partitions appear, we can get the leading 
three-partition terms G 2+2+2 and H^ +2+2 from ^(X, A, A) by imposing the condition 
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which guarantees that their coefficients are in the correct ratio, and which for any p > 1 obeys the 
condition 3 A < 1. The overall coefficient of S3 needed to fit G2+2+2 and #3+2+2 is 

K3 = JWTW • (93) 

We now look for an integration formula of the form 

U 6 

K3 S3(A,A,A) + 4S 2 (2A,A) + ^4S 2 (A 2 ,A l 2 )+^4E 1 (Ai) + K ^ , (94) 

i=\ i=l 

with U < 6 since there are 6 two-partition terms to be matched. Equating coefficients of the two- 
partition terms, we find that the equations for G4+2 — G3+3 and #5+2 — #4+3 are both automatically 
satisfied by taking 

k' 2 = 3k 3 . (95) 

This leaves only the two-partition terms -E2+2, ^3+2, G4+2, and #5+2 to be matched, so we can 
take the upper limit in the X2 summation as U = 4. The four coefficients k % 2 are then determined 
by solving an TV = 4 Vandermonde system with inhomogeneous terms g2j , i = 1, 4, 





—^2 j 


= 24 (A 2 ) 4 


?2i 


1 

P41 


- (6£ + 24)k 3 A 4 


«2 2 


2 

P51 


- (6£ + 36)k 3 A 5 


<?2 3 


6 

P6i 


- (6£ + 60)k 3 A 6 


«2 4 


24 
P71 


- (6£ + 108)k 3 A 7 



(96) 

We next have to match the 6 single partition terms, using Si sums. To save function calls, we 
take four of the parameters A^ to be equal to 2A 2 , with the other two A^ taken as new, independent 
parameters. Equating the coefficients of the single partition terms C 2 through H7 then gives a 
N = 6 Vandermonde system determining the coefficients n\, with inhomogeneous terms qli , i = 
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1,...,6, 



Xi =X\ , Wi = k\(X\) 2 
i 4 

qU = 2i V 4(A 2 ) 2 - (3£ 2 + 15C)A 2 k 3 , 

Tim ^— ' 



P21 



i=l 
1 



?1 2 ~ - 2^ 4(A|) 3 - (3£ 2 + 270A 3 k 3 , 
i=i 



P31 



6 

qh = e?2i - (3£ 2 + 510A 4 ^ 3 , 

P41 

24 

?l4 = e?2 2 - (3£ 2 + 99£)A 5 k 3 , 

P51 
120 

gl 5 = &2 3 - (3£ 2 + 1950A 6 k 3 , 



P61 
720 

gl 6 = £<z2 4 - (3£ 2 + 3870A 7 k 3 

P71 



(97) 



Note that in gl 3 , the sums 2^ j=1 4(^2)*' > .7 = 4, ...,7 have been eliminated in terms of 

g2i,...,g24 by using the Vandermonde system of Eq. ([96]) . Finally, matching the coefficient of A 
we get, using Eq. ([95]) 

4 6 

«o = i-i2o, J R = (e 3 + 3e 2 )^ 3 + e 2 ^4 + e^4 • (98) 

i=l i=l 

F. Ninth order formula 

To get a ninth order formula, we use the sums of Eq. (|63p to match the coefficients appearing 
in Eq. ([62]) through the final exhibited term J 3 +2+2+2- Since at most four partitions appear, we 
can get the leading four-partition terms «/ 3 +2+2+2 and ^2+2+2+2 from E^A, A, A, A) by imposing the 
condition 

A = — ^— , (99) 
P+9 ' V ' 

which guarantees that their coefficients are in the correct ratio, and which for any p > 1 obeys the 
condition 4 A < 1. The overall coefficient of £4 needed to fit J 3 +2+2+2 and / 2 +2+2+2 is 

K4 -4!(p + 8)! A -614W • ( ° 0) 
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We now look (with benefit of hindsight) for an integration formula of the form 

4 4 

k 4 S 4 (A, A, A, A) + 4S 3 (2A, A, A) + 4%(3A, A) + £ 4S 3 (A|, A 3 , A|) + £ sjEaM, A 3 ) 

i=l i=l 

6 4 4 

+ ]T 4s 2 (A i 2 , A 2 ) + J2 4Si(3A|) + £ «iSi(2A* 3 ) + k A , 

i=l i=l i=l 

(101) 

with four of the A 2 taken equal to the four A 3 , and the other two A 2 additional parameters. (Again, 
we reuse parameters wherever similar structures are involved in Eq. (|65p . so as to save function 
calls.) 

We proceed to sketch the remaining calculation, without writing down the detailed form of 
the resulting Vandermonde equations (which can be read off from the programs, and is fairly 
complicated). We begin with the three-partition terms. The equations for 75+2+2 — J4+3+2, 
J4+3+2 — ^3+3+3, and ^4+2+2 — -^3+3+2 are all automatically satisfied by taking 

4 = 6k 4 . (102) 

This leaves four independent matching conditions for G2+2+2, #3+2+2, I4+2+2, and J5+2+2, which 
lead to a N = 4 Vandermonde system determining the coefficients k\. We turn next to the two- 
partition terms. We find that the equations for Iq+2~ 4.5/5+3 + 3.5/4+4 and J7+2 — 3.5/6+3 + 2.5/5+4 
are automatically satisfied by taking 

4 = 8k 4 . (103) 

The four equations for G4+2 - G3+3, i/5+2 - #4+3, / 6+2 - I4+4, and / 7+2 + Je+3 - 2J5+4 then 
give a N = 4 Vandermonde system determining the coefficients k 1 2 - The remaining independent 
equations matching two-partition terms, for E2+2, -F3+2, C4+2, #5+2 , ^6+2, and J7+2, then give a 
N = 6 Vandermonde system determining the coefficients R 2 - 

Turning to the single partition terms, the eight equations obtained by matching coefficients for 
C2, -D3, £4, F5, Gq, H7, /§, and Jg give a N = 8 Vandermonde system determining simultaneously 
the four coefficients n\ and the four coefficients R\. Finally, equating coefficients of A gives 

4 4 6 4 

K0 = i-r, J R = (e 4 +6e 3 +8e 2 )K 4 +c 3 ^4+e 2 E4 + E4)+e^K + 4) • (104) 

i=l 8=1 i=l i=l 
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G. Leading term in higher order 

We have not systematically pursued constructing integration formulas of orders higher than 
ninth, but this should be possible by the same method. One can, however, see what the pattern 
will be for the leading term in such formulas. An integration formula of order 2t + 1 will have 
a leading term S^(A,...,A), with t arguments A. The only partition t terms appearing in the 
continuation of Eq. (162ft will be 2 + 2 + .... + 2, containing t terms 2, and 3 + 2 + ... + 2, with one 
3 and t — 1 terms 2. Requiring these to have coefficients in the correct ratio restricts A to be 

A = , (105) 

P + 2t + l ' V > 

and the leading term in the integration formula will be KfSf (A, A), with m given by 

Kt = t!( P + 2t)\\* ■ (106) 
Where nonleading terms give multiple equations of the same order, corresponding to inequiv- 
alent partitions of 2t + 1, 2t, one has to include terms proportional to S^_i(2A, A, A), 
St-2(3A, A, A), and other such structures with asymmetric arguments summing to tX, for the 
differences of these multiple equations to have consistent solutions. Once such multiplicities have 
been taken care of, the remaining independent equations will form a number of sets of Vandermonde 
equations. 

VIII. DERIVATION OF THE SIMPLEX GENERATING FUNCTION 



We give here a simple proof of the simplex generating function formulas of Eqs. (J54J) and (I55p . 
using the standard simplex integral 



/ 



, d Xl ...dx p (1 - xi - x 2 - ... - x p ) vo x?...x?' = j 1 ^ - ' , (107) 

/standard simplex VP ' Z-^a=0 " 

(which we obtain later on as a specialization of the multinomial beta function integral of Eq. 
(|168p ) . the simplex volume formula of Eq. (|42p . and the expansion formulas of Eqs. (|17j) through 
(fT9D . We start from 



d Xl ...dx p f; n j=i ( ^p = / d Xl ...dx p eZU^ 

.. _n 1 li=1 v i ■ ./simplex 



(108) 



. ^ L ...^. p ^ p 

/simplex ^...^=0 1 1 *= 1 * " 

and substitute the expansion of Eq. (fTTft on the right hand side, giving 

/ dx 1 ...dx p e^=o a «T, , i= 1 Xa l U _ ( 10 g) 

J simplex 
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We now express the integral over the general simplex in terms of an integral over its barycentric 
coordinates a a . Since X^a=o Q a = ^> we can rewr ite Eq. (fTUj) . by subtraction of xq from both sides, 
as 

p v 
x - x = ^(xa - x )a a = - x )a a . (110) 

a=0 a=l 

From this we immediately find for the Jacobian 

dai'da ) =l det ( iE "- a; o)«| = yp! , (HI) 

with V the volume of the simplex. Since the a a , a = 1, ...,p span a standard simplex, we have 
transformed the integral of Eq. (|109p to the form 

Vp\ I dax...da p e^=o aa ^i=i z ^ . (112) 

J standard simplex 

Expanding the exponential on the right in a power series, we have 

: f d ai ...da p f; nLo(»ar°(ELi^t,)^ ^ (113) 

J standard simplex ui...u v =0 lla= 

and then recalling that oq = 1 — Y^a=i a ai an d using Eq. (|107f) to evaluate the integral over the 
standard simplex, we get 

Let us now define P n as the projector on terms with a total of n powers of the parameters ti, 
since this is the projector that extracts the nth order moments. Applying P n to Eq. (|114p . the 
denominator is converted to (p + n) !, which can then be pulled outside the sum over the 
permitting these sums to be evaluated as geometric series, 

P T/nl \" ria=o(Z^=l x aitiY a 
rr i OO p p 

p " e new* 



(p + n) ! 

v ^ ; t/i...Vp=0a=0 i=l 



(p + n) ! 

v ^ ' a=0 i=l 



(115) 



Finally, applying to each factor in the product over a the rearrangement 

oo s 

(1-y)- 1 =exp[-log(l-y)] = exp[V ^] , (116) 



s=l 
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we get 



(p + n) 



P neX p[^^oi^i^!L] j (H7) 

where we have used the fact that the s = 1 term in the sum vanishes because ^ a x a i = 0. 
Comparing Eq. (j!17|) with the starting equation Eq. (|1Q8|) . we get Eqs. (}54"|) and ([55]) . 



IX. PARAMETERIZED HIGHER ORDER INTEGRATION FORMULAS FOR 

AXIS-PARALLEL HYPERCUBES 

We turn in this section to the problem of deriving higher order integration formulas for axis- 
parallel hypercubes, in analogy with our treatment of the simplex case. Our formulas can be 
viewed as a generalization of those obtained by Lyness (1965) and McNamee and Stenger (1967). 
We consider an axis-parallel hypercube of half-side S, and denote by x coordinates referred to the 
centroid of the hypercube. Through ninth order, the expansion of a general function f(x) is given 
as before by Eq. (|52p . Consider now the moment integrals 



m (^) = / dx\...dx p x V i ...x p p . (118) 

J hypercube 

Since the limits of integration for each axis are —S, S, the moment integral factorizes and can be 
immediately evaluated as 

P qu e +l 

m(v) =T\- -[1 + (-1)^1 



4 V * + 1 



=0 any vt odd , 

= 11 ^71 all ^ even 

P gut 

=V 1 f all vi even , 

(119) 

with V = (2S) P in the final line the hypercube volume. 

We now reexpress this moment integral in terms of sums over the set of 2p hypercube points 
Xj given in Eq. (|49p . which will play a role in hypercube integration analogous to that played by 
simplex vertices in our treatment of simplex integration. In analogy with Eq. (|58p . we define the 
sum 

2p 

Si x ...i n = 'y ^Xji l ...Xjj n . (120) 
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Since Xji = S5ji for 1 < j < p and Xji = —S5ji for p + 1 < j < 2p, this sum vanishes unless n is 
even and all of the indices ii,...,i n are equal, in which case it is equal to 2S n . The tensors of Eq. 
(|120p and their direct products form a complete basis on which we can expand moment integrals 
over the hypercube. We have carried out this calculation two different ways. First, by matching 
the non- vanishing moment integrals through eighth order, we find 



If 1 

— / dx\...dxpXi 1 Xi 2 — ~x^iit2 i 

* J hypercube " 

If 1 1 

77 / dxi...dxpXi 1 Xi 2 Xi 3 Xi 4 — — — (S^jj Si^i^ + Si^i^ Si 2 i 4 ~\~ S^i^S^i^) Tr^S^i 

V Jhypercube ^0 10 



: 2g {8i 1 i 2 Si 3 i A + 2 terms) ^Shi2i3U , 



v L 



if „„„„„„ i 

— — / dx\...dxpXi 1 Xi 2 Xi s Xi 4 Xi 5 Xi e — (.8i^i 2 Si 3 i 4 S^ig -|- 14 terms) 

' ./hypercube ^-L^ 

1 8 

— ^ (Siy^Sij^igi^ + 14terms) + g~7^<S'i 1 i2i3i4j5*6 j 

dx\...dxpXi 1 Xi 2 Xi 3 Xi 4 Xi^Xi e Xi 7 Xi s — 777777;(.Si 1 i 2 Si 3 i 4 Si 5 iQSi 7 ig + 104 terms) 

hypercube IZyu 

225 ^ ili2i 3* 4 ^5*6«7«8 + 34 terms) 

_ 540 ^* 1 * 2 '^'* 3 * 4 '^'* 5 * 6 * 7 * 8 209 terms) 
4 

189 ^ ili2 ^ i3iAihi<ii7is ^ 27 terms) 

_A 5 



(121) 



Combining these formulas with Eq. (|52h . and using a similar condensed notation to that used in 
the simplex case (but with the contractions referring now referring to the sums over the hypercube 
of Eq. (|120p ). we have for the integral of a general function over the hypercube, through ninth 
order, 



77j j dx 1 ...dx p f(x) =A + ^C 2 + 777E2+2 - t 1 =E 4 

V ./hypercube 10 



^4 

'hypercube 

+^G 2+2+2 - 1g 4+2 + A Gg 

1 1 1 4 8 

+ 1296 /2+2+2+2 + 225 /4+4 " 540 /4+2+2 + 189 h+2 " 15 



(122) 

A second, and more general way, to obtain these results is to construct a generating function, 
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analogous to that of Good and Gaskins used in the simplex case. We start from the formula 

S S 'P 

V- 1 f d Xl ...l dxpe^--^ =f[ S ^p- , (123) 

and recall the power series expansion for the logarithm of sir ^ 1 - , 

.sinhxs 1 9 1a 1 fi 1r 

log ( ) =-x 2 x 4 H x° x* + ... 

6 V x ' 6 180 2835 37800 



E 



(_l)n+l 2 2n-l B2n _ l ^ 



n(2n) ! 

n=l v ' 



(124) 



with Bm-\ the Bernoulli numbers B\ = g , ,63 = ^ , £5 = ^ , = ^ , £?g = Jr , ... . 
Defining, in analogy with the simplex case (with x a i now the components of the 2p vectors x a of 
Eq. (02])), 



2p / P \ M 

a=l \i=l / 

=0 , u odd , 

P 

=2S U * t > u even > 



(125) 



we rewrite Eq. (|123p . using Eq. (|124p . as 



with 



y- 1 / da*.../ dx p e tlXl+ - +t ? x ? = e ^^ KnW ^ , (126) 
i-S J-s 



(_ l) n + l 2 2n-2^ n _ l 

— — • (127) 



Through eighth order, the right hand side of Eq. (I126p is 

e WVl2-iy 4 /360+WV5670-W s /75600+... = } 

12 

_ w 4 wf 

~360 + 288 

W 6 W2W4 + Wf 



5670 4320 10368 



75600 259200 68040 103680 497664 



(128) 
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Applying the same rule as in the simplex case, of multiplying the coefficient of each term by a com- 
binatoric factor (s \ )H ! for each factor W*, we recover the numerical coefficients in the expansions 
of Eqs. (|12ip and (|122|) . This method can be readily extended to the higher order terms of these 
expansions. 

We now follow the procedure used before in the simplex case. We match the expansion of Eq. 
(|122p to sums over the function / evaluated at points within the hypercube, this time constructing 
these sums using parameter multiples of the 2p points of Eq. (09]), which are the centroids of the 
maximal boundary hypercubes. The formulas of Eqs. (j63j) . ()64|) . and (|65[) still apply, with sums 
that extended from to p in the simplex case extending now from 1 to 2p, and with £ = p + 1 in 
Eq. (pH) replaced by £ = 2p. 

We proceed to set up integration formulas of first, third, fifth, seventh, and ninth order, for 
integrals over an axis-parallel hypercube. Since all odd order terms in the expansion of Eq. (|52p 
integrate to zero by symmetry, to achieve this accuracy it suffices to perform a matching of the 
non-vanishing terms through zeroth, second, fourth, sixth, and eighth order, respectively. We will 
see that as a result of the absence of odd order terms, the higher order hypercube formulas are 
considerably simpler than their general simplex analogs. 

A. First and third order formulas 

We begin our derivation of odd order hypercube integration formulas with examples of first and 
third order accuracy, obtained by matching the first two terms in the expansion of Eq. (I122p , 

I = A + \c 2 + ... . (129) 
o 

Proceeding in direct analogy with the first order formulas of Eq. (j66| in the simplex case, we get 

7 a =Ei(A)/£ , 
h=f(0) , 

(130) 

with x = the centroid of the hypercube, and £ = 2p. These are again analogs of the trapezoidal 
and center-of-bin methods for the one dimensional case. 

Similarly, in analogy with the second order accurate formula of Eq. (|69h for the simplex, we get 
the third order accurate hypercube formula 
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For any two nondegenerate values in the interval (0,1) this gives two different third order 
accurate estimates I a ^ of the hypercube integral. 

B. Fifth order formula 

To get a fifth order formula, we have to use the sums of Eq. (|63p to match the coefficients of 
A, C2, E2+2, and E4 appearing on the first line of Eq. (1122p . Since there is now only a single 
two-partition term, E2+2, we can extract it from ^(A, A) for any A in the interval (0, ^). So we 
look for a fifth order formula of the form 

2 

k 2 S 2 (A,A) + ^4S 1 (A 1 1 ) + K o^ . (132) 

i=l 

Matching the coefficient of -E2+2 gives 

K2 = 7^ ' (133) 

while matching the coefficient of A gives 

2 

k = 1-R , R = ^ K2 + ^k\ . (134) 

i=l 

Matching the coefficients of C2 and E4 gives a N = 2 Vandermonde system (c.f. Eq. (|88p ) with 

.J f\i\2 



X 1 


=(a|; 


» 2 , ■ 




1 




qi 


~6 ~ 


36A 2 




-1 




<12 


~ T5" 


36 



(135) 



C. Seventh order formula 



To get a seventh order formula, we have to match the coefficients appearing on the first two 

lines of Eq. (|122p . Since there is only one three-partition term, G2+2+2) we can extract it from 

Ea(A, A, A) for any A in the interval (0, |). We look for a seventh order formula of the form 

2 3 
«3S 3 (A ! A,A)+^4E 2 (A|,A l 2 ) + ^4s 1 (Ai) + ^ , (136) 

i=l i=l 

with matching the coefficient of G2+2+2 requiring 
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To reduce the number of function calls, we take A^ +1 = 2A 2 , i = 1,2, with only A} an additional 
parameter. Matching the coefficients of the two-partition terms E 2 +2 and G± + 2 gives a N = 2 
Vandermonde system with 

x i= (A 2 ) 2 , Wi = 24(\i) A , 

i £ 



91 

92 



36 216A 2 

-_ J__ J_ 
90 216 



(138) 



while matching the coefficient of A gives 

2 3 

Ko = l-i2o, #o=£ 3 «3 + £ 2 l>i + £l>i • (139) 

8=1 8=1 

Matching coefficients of the single-partition terms C 2 , £4, and Gg gives the iV = 3 Vandermonde 
system with 

Xi =(A{) 2 , ^ = k\(X\) 2 , 



^=i-lll4-2^E4(A 2 ) 
i=i 



6 432A 4 



1 e , e : 



2 



92 15 36 + 432A 2 

93 63 90 432 



(140) 



D. Ninth order formula 



To get a ninth order formula, we have to match the coefficients of all three lines of Eq. (|122p . 
Since there is only one four-partition term I2+2+2+21 we can extract it from S4(A,A, A, A) for any 
A in the interval (0, 4). We look for a ninth order formula of the form 

2 34 
K4 S 4 (A, A, A, A)+]T 4^3(A|, A|, A|)+4S 2 (3A, A)+£ A 2 )+]T /4 s i(M)+ K o^ , (141) 

8=1 8 = 1 8=1 

with matching the coefficient of ^2+2+2+2 requiring 
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To reduce the number of function calls, we take \\ = \\ , i = 1,2, and X\ = 3A 2 , % = 1,2,3, 
with only \\ and \\ additional parameters. Matching the coefficients of the three-partition terms 
G2+2+2 an d Z4+2+2 gives a iV = 2 Vandermonde system with 

^=(A 3 ) 2 , ^ = 6/4(A|) 6 , 

1 e 



g3i 

?3 2 = - 



216 1296A 2 

1 z 



540 1296 



(143) 



Taking the difference of the matching equations for Iq + 2 and 74+4 determines k' 2 to be 



' - 237 _ 237 f, AA \ 
K2 ~ 8164800A8 " 262^ 4 ' (M4j 

while matching the coefficient of A gives 

2 34 

Ko = i-i? , flo=64+^4+^(4+E^)+5E K i • ( 145 ) 

i=l 2=1 2=1 

Matching the remaining independent two-partition terms E2+2, G^+2-, and 14+4 gives a N = 3 
Vandermonde system with 

i'i=(4) 2 , w, = 24(4)" , 

2=1 

1 £ f 2 

q2 2 = k' 2 90A 6 - — + " 



q2 



3 



90 2 216 2592A 2 

— - 4162A 8 + 4- + — 



225 z 540 2592 



(146) 
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Finally, matching coefficients of the single-partition terms C2, E4, Gq, and Ig gives a N = 4 
Vandermonde system with 

x i= (Al) 2 , Wi = K \(X\) 2 , 
qh =1 - - - 3f jz 4(A 3 ) 2 - 2£ E 4(A 2 ) 2 , 

i=l i=l 

9l2 = " 15 - 777^4 " 482£A 4 - 3£ 2 E 4(A 3 ) 4 - C ^ , 

i=i 

qh ~ ~ t^To " 4?30£A 6 - ^ 2 ?3i - e 9 2 2 , 



63 7776A 2 2 2 
?U = - ^ - - 46562eA 8 - l -e <?3 2 - e q2 3 



(147) 



E. Leading term in higher order 

As in the simplex analysis, in the hypercube case we have not systematically pursued con- 
structing integration formulas of orders higher than ninth, but this should be possible by the same 
method. Again, one can see what the pattern will be for the leading term in such formulas. An 
integration formula of order 2t + 1 will have a leading term £j (A, A), with t arguments A. The 
only t-partition term appearing in the continuation of Eq. (|122f) will be 2 + 2 + .... + 2, containing 
t terms 2. So A can be taken to have any value in the interval (0, j), and the leading term in the 
integration formula will be «tS t (A, A), with K t given by 

* = tw& ■ (148) 

Where nonleading terms give multiple equations of the same order, corresponding to inequivalent 
partitions of 2i,... one has to include terms proportional to Sj_2(3A, A, A), and other such 
structures with asymmetric arguments summing to tX, for the differences of these multiple equations 
to have consistent solutions. Once such multiplicities have been taken care of, the remaining 
independent equations will form a number of sets of Vandermonde equations. 

F. One dimension revisited: comparison of Vandermonde moment fitting to standard one 

dimensional methods 



In this subsection we address several related issues. We first set up a one-dimensional analog 
of the moment fitting method that we have used in general p dimensions to develop higher order 
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integration formulas. The one dimensional moment fitting equations can be satisfied by leaving 
the sampling points as free parameters, giving in any order a Vandermonde equation system to 
determine the weights assigned to the sampling points. Alternatively, the moment fitting equations 
can be satisfied by restricting the sampling points, which is what is done in Gaussian quadrature, 
which reduces the number of function calls. We then show that the fifth (and higher) order direct 
hypercube integration formulas for p dimensions, when restricted to one dimension, involve a larger 
number of function calls than needed for the case when the sampling points are all free parameters. 
We interpret this as resulting from linear dependencies in low dimension among the various terms 
appearing in the generating function of Eq. (1128P . We study these linear dependencies as a function 
of dimension p, and suggest that the integration rule of order 2t + 1 has redundant parameters 
when the spatial dimension p < t. 

Let f(x) = fo + fix + f2% 2 + f3X 3 + /4X 4 + ... be a function that is power series expandable on 
the interval (-1,1), and consider the one dimensional integral 




f .J2.J4.J6.J8. 



(149) 



Defining Si (A) by 



Sx(A) = /(A) + /(-A), 0<A<1 



(150) 



we look for an integration formula of the form 



n 



n 



=/o + £ Afo + / 2 (A 1 ) 2 + / 4 (A 4 ) 4 + / 6 (A*) 6 + / 8 (A*) 8 + ...] 



i=l 



(151) 
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Matching the coefficients of ft between Eq. (|149p and Eq. (|15ip , we get the system of equations 

n 

1 =Ko + 2 ^ K< » 

1=1 

6 i=l 
1 n 

1=1 

n 

-= 2 J>'(A*) 6 , 
i=i 

y i=l 

(152) 

and similarly if one wishes to go to higher order than ninth. 

There are now two ways to proceed to solve the matching equations, to give a discrete approxi- 
mation to the integral to a given order of accuracy. The first, which is what we have done in getting 
simplex and hypercube integration formulas, is to regard all of the A* as adjustable parameters, 
and to determine the coefficients k % to satisfy the system of equations of Eq. (|152p to the needed 
order. Thus, to get a first order accurate formula, we take I ~ /(0) with all the k 1 equal to zero, 
which is the center-of-bin rule. To get a third order accurate formula we must take k 1 as nonzero 
and solve the system 

1 =k + 2k 1 , 
3 V ; 

(153) 

To get a fifth order accurate formula we must take both k 1 and k 2 as nonzero and solve the system 

1 =K + 2(/t X + K 2 ) , 

i =2k 1 (X 1 ) 2 + 2k 2 (\ 2 ) 2 , 

^=2 k 1 (A 1 ) 4 + 2k 2 (A 2 ) 4 , 
5 

(154) 
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to get a seventh order accurate formula we must take k 1,2 ' 3 as nonzero and solve the system 

1 =K + 2<y + K 2 + K 3 ) , 

-=2k 1 (X 1 ) 2 + 2k 2 (X 2 ) 2 + 2k 3 (X 3 ) 2 , 
3 

i=2K 1 (A 1 ) 4 + 2K 2 (A 2 ) 4 + 2 K 3 (A 3 ) 4 , 
5 

i =2/ t 1 (A 1 ) 6 + 2K 2 (A 2 ) 6 + 2 K 3 (A 3 ) 6 , 

(155) 

and so forth. Evidently, to get an order 2t + 1 formula, we must take n = i, so that there are £ 
distinct positive sampling points A and to determine the coefficients re 1 '—'* we must solve an 
order N = t Vandermonde system. The resulting order 2t + 1 integration formula uses 2t + 1 
function values. 

An alternative way to proceed is to adjust the values of the sampling points so that fewer of them 
are needed to satisfy the matching conditions. This is what is done in the well-known Gaussian 
integration method, which gives a more efficient scheme, in terms of the number of function calls, 
starting with third order. Referring to Eq. (|153p . we can evidently achieve a third order match by 
taking 

2k =0 , 2k 1 = 1 , 



(156) 



Similarly, referring to Eq. (|154p . we can evidently achieve a fifth order match by taking 

8 5 
u 9 9 ' 

\/5 



(157) 



Proceeding in this way, we can obtain the general Gaussian integration formula, which for order 
2t + 1 integration involves t points. Of course, the usual derivation of the Gaussian integration 
rule does not proceed this way, but instead uses an argument based on one dimensional polynomial 
long division to relate the special points A* to zeros of the Legendre polynomials. Since in higher 
dimensions there is no analogous polynomial division rule, there is no universal higher dimensional 
analog of the Gaussian integration rule, although there are a multitude of special formulas using 
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specially chosen sampling points in higher dimensions (see, e.g., Stroud (1971)). On the other hand, 
as we have seen, the method of keeping all the sampling points A 4 as free parameters, and solving 
a set of Vandermonde equations to get the coefficients k 1 , readily extends to higher dimensions. 

Let us now examine the number of function evaluations required by our general moment fitting 
formulas for hypercubes, when restricted to one dimension. The first order center-of-bin formula 
requires just the one function evaluation /(0), and so is the same in all methods. The third order 
formula of Eq. (|13ip involves /(0) and one Ei(A), and so uses 3 function values, in agreement 
with Eq. (|153p . whereas the Gaussian formula needs 2 function values. Turning to the fifth order 
formula of Eq. (|132p . calculation of £2 (A, A) requires 3 function evaluations, calculation of each 
of the two Ei (A 1 ) requires 2 function evaluations, and evaluation of /(0) requires one function 
evaluation, for a total of 8 function evaluations. This is to be compared to the one dimensional 
moment fitting formula of Eq. (|154p which requires 5 function evaluations, and the Gaussian 
method, which requires 3. 

The reason that the fifth order integration formula for general p, when specialized to one di- 
mension, requires more function evaluations than the moment fitting method of Eq. (|154p . is that 
whereas in two and higher dimensions W4 and W 2 2 are linearly independent, in one dimension they 
are proportional to one another by virtue of the identity t\ = (t 2 ) 2 . Hence the term S2(A, A) in 
the general integration formula is not needed to get a match, and when this is dropped one has 
a formula identical in form to that of Eq. (I154p . requiring only 3 function calls. Turning to the 
higher order hypercube formulas, we see that the seventh order hypercube formula of Eq. (|136p 
has redundant parameters and function calls for dimension p < 3, since in 2 dimensions Wq, W2W4, 
and Wf are linearly dependent by virtue of the algebraic identity 

= (t 2 1 +t 2 2 ) 3 -3(tl + t 2 2 )(tt+4) + 2(4 + 4) . (158) 

Similarly, the ninth order hypercube formula of Eq. (|14ip has redundant parameters and function 
calls for dimension p < 4, since in 3 dimensions W§, W 2 , \V2We, Wf W4, and W 2 are linearly 
dependent by virtue of the identity 

={t l +t l + _ 6(t 8 +t S + ^ + 3(t 4 + f 4 + t A } 2 

+8(t 2 + 4 + 1 2 )(4 + 4 + 4) - 6(t 2 + 4 + 4) 2 (tj + 4 + 4) . 

(159) 

These results suggest the conjecture that the hypercube formula of order 2t + 1 will involve 
redundant parameters and function calls for dimension p < t, and we expect an analogous state- 
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ment to apply for the simplex formulas derived by the moment fitting method in Sec. VII. This 
redundancy for small p is a consequence of the fact that the integration formulas that we have 
derived for simplexes and hypercubes are universal, in the sense that they involve the same num- 
ber of parameters irrespective of the dimension p. As p increases, the number of sampling points 
increases, but the number of parameters, and the size of the Vandermonde systems needed to find 
coefficients, remains fixed. 

X. FUNCTION CALLS NEEDED FOR INTEGRATION ROUTINES OF VARIOUS 

ORDERS 

We summarize in this section the number of function calls needed for a single call to the 
integration routines of various orders. These are obtained by running the programs to integrate 
the function fcn=l, in which case the programs exit without subdividing the base region, giving 
the desired function call count for two samplings of the integral at the indicated order of accuracy, 
as well as unity as the output integral (since the programs all compute the integral over the base 
region, divided by the base region volume). 

In Table VI we give the function call counting for the simplex integration programs of first 
through fourth, fifth, seventh, and ninth order. For comparison, in Table VII we give a similar table 
from the paper of Genz and Cools (2003), which gives the function call counting for one evaluation 
of the indicated order, plus a second evaluation at a lower order used for error estimation. Unlike 
our method, which proceeds directly from the vertices of a general simplex, the Genz and Cools 
program uses integration rules for a standard p-simplex, with an affine transformation needed to 
treat more general simplexes. Although not directly comparable, the two tables show that the 
strategy we have used, of incorporating a number of free parameters into the integration which can 
be used to give different samplings of the integrand, does not lead to an inefficiency of more than 
a factor of 2 to 3 compared to the method used by Genz and Cools. 

In Table VIII we give the function call counting for the direct hypercube programs of first, 
third, fifth, seventh, and ninth order. For comparison, in Table IX we have tabulated t p + (t + l) p , 
with the odd order of integration n related to t by n = 2t + 1; this is the number of function calls 
needed if one uses a p-fold direct product of Gaussian integrations of indicated order, together 
with a p-fold direct product of Gaussian integrations of the next higher odd order to get an error 
estimate. One sees from these tables that for t = 1, 2, 3 our parameterized method is more efficient 
than direct product Gaussian for dimension p > 4, and for t = 4 the parameterized method is more 



55 



TABLE VI: Function calls by for simplex integration of order n in dimension p by method of Sec. VII 
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7 37 71 117 176 249 337 441 562 701 
9 74 168 316 531 827 1219 1723 2356 3136 

TABLE VII: Function calls for simplex integration of order n in dimension p from Genz and Cools (2003) 

np^ 2345678 9 
3 7 9 11 13 15 17 19 21 
5 16 23 31 40 50 61 73 86 
7 32 49 86 126 176 237 310 396 
9 65 114 201 315 470 675 940 1276 

efficient for p > 5. Since the number of function calls in the parameterized method is asymptotically 
polynomial of order (2p) l /t\, whereas in the direct product Gaussian method it is exponential in 
p, the parameterized method becomes markedly more efficient for large dimension p. 

These results reinforce the indication from the previous section that, as a very rough rule of 
thumb, in using integration routines with n = 2t + 1 in dimension p, one should avoid high order 
routines with t > p. This is true both because in low dimension the higher order routines have 
redundant function calls, and because the extra computation involved in using a high order routine 
is justified only when the 2 P scaling in the number of subregions, as the program subdivides from 
level to level, becomes large enough. However, this is only a very general criterion, since the 
optimum choice or choices of integration routine order will depend on the nature of the function 
being integrated. Moreover, in dimension p = 1 the programs are so fast on current computers that 
use of the fifth or seventh order integration routines, while not as efficient as Gaussian integration, 
still gives good results. 
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TABLE VIII: Function calls for hypercube integration of order n in dimension p by method of Sec. VIII 
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TABLE IX: Function calls for hypercube integration of order n in dimension p by comparison of two product 
Gaussian rules 

np-^123 4 5 6 7 
3 3 5 9 17 33 65 129 
5 5 13 35 97 275 793 2315 
7 7 25 91 337 1267 4825 18571 
9 9 41 189 881 4149 19721 94509 

XI. PUTTING IT ALL TOGETHER SKETCH OF THE ALGORITHMS 

We are now ready to give a sketch of the adaptive algorithms incorporating the elements de- 
scribed above. The basic algorithm starts from a base region, which acts as the initial level 
subregion, which is either a standard simplex, a Kuhn simplex (for hypercube integration treated 
by tiling with Kuhn simplexes), or a half-side 1 hypercube. It then proceeds recursively through 
higher levels of subdivision, by evaluating the integral using an integration method of order speci- 
fied by the user with two different parameter choices, giving two estimates of the integral over the 
subregion divided by the subregion volume, which we denote by I a (subregion) and If, (subregion). 
(Dividing out the volume is convenient because of the 1/V factor appearing on the left hand side of 
Eqs. (I62h and (1122j) .) If the level number exceeds a user-specified value ithinlev which determines 
when thinning begins, then a thinning condition is applied. When the user-specified thinning 
function parameter ithinfun is given the value 1, the thinning condition used is 

| I a (subregion) — If, (subregion) | < e , (160) 

with e an error measure specified by the user. (Further thinning options will be discussed shortly.) 
If this condition is met, the results are retained as contributions to the I a and estimates of 
the integral divided by the base region volume, and the subregion is not further subdivided. If 
this condition is not met, then the subregion is subdivided into 2 P subregions, and the process is 
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repeated. The process terminates when either the thinning condition is met for all subregions, or 
a limit to the number of levels of subdivision set by the user is reached. In the latter case, the 
contributions of the remaining subregions that have not satisfied the thinning condition are added 
to the I a and lb totals, as well as to the sum of the absolute values of the local subinterval errors. 

With either termination, we get the final estimates of the integral divided by the base region 
volume, 

I a — 5^ ^ (subregion) I a (subregion) , 

subregions 

lb — V (subregion) J& (subregion) 

subregions 

(161) 

Here y(subregion) is the subregion volume divided by the base region volume, and since the 
subregions are a tiling of the initial base region, we have 

^(subregion) = 1 . (162) 

subregions 

From the difference of I a and lb we get an estimate of the error, given by 

|outdiff| = \I a - I b \ . (163) 

We can also (as in the one dimensional illustration) compute the sum of the absolute values of the 
local subinterval errors, 

errsum = V (subregion) \I a (subregion) —lb (subregion) | . (164) 

subregions 

Comparing Eqs. (116ip . fjl63f) . and (I164p . we see that errsum and |outdiff| obey the inequality 

errsum > |outdiff| , (165) 

with equality holding if I a — lb has the same sign in all subregions. When the condition 
| I a (subregion) — lb (subregion) | < e is met for all subregions, errsum reduces, using Eq. (j!62|) . 
to 



errsum < e . 

Hence to evaluate the integral to a relative error 5, one should choose 



(166) 
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Since I a and If, give the integral over the base region divided by the base region volume, to get the 
value of the integral without normalization by the base region volume, one must multiply these 
outputs by the base region volume V$. For a standard simplex, Vq = l/p\, for a side 1 hypercube, 
Vq = 1, while for a half-side 1 hypercube, Vq = 2 P . 

Note that the thinning condition determining whether to subdivide a subregion does not include 
a factor of the subregion volume; we are testing variances of the integrand as sampled over the 
subregion, not variances of the net contribution to the integral. This may seem counter-intuitive, 
but is motivated by the formulas of Eqs. (|162p - (|166p . by simplicity, and by the fact that it works 
well in practice. The problem with including a subregion volume weighting factor in the thinning 
condition is that at a very fine level of subdivision, there are many subregions, and so small error 
contributions from each can add up to a large error in the total. Since the local test does not 
involve comparisons of the errors from different regions, the calculation in each subregion proceeds 
independently from that in all the others. The local thinning condition that we use is equivalent 
to the "Local Subdivision Strategy" described in the monographs of Krommer and Ueberhuber 
(1991) and Ueberhuber (1995) using a parameter e a b s , which plays the role of our e. 

Using 1 1 a (subregion) — 1^ (subregion) | as the basis for a thinning decision is only one possibility 
of many. More generally, given A = I a (subregion) and B = 1^ (subregion), one can take as the 
thinning function any function f(A, B) with the properties f(A, B) > and f(A, B) = iff A = B, 
imposing now the thinning condition f(A,B) < e. In the programs, we have included three 
options, (1) f(A,B) = \A - B\ as in the discussion above, (2) f(A,B) = \A - B\/\A + B\, and 
(3) f(A, B) = (A — B) 2 . In many cases, and in particular for polynomial integrals, we found their 
performance (with appropriate e) to be similar, but for the singular integral & x ^\_ x i we found 
choice (3) to perform considerably better than the other two. 

Three versions of the basic algorithm are presented in each of the directories of programs. In 
the first, the algorithm subdivides until all subregions obey the thinning condition, or until a preset 
limit on the level of subdivisions is reached, which is dictated by the available memory. Typically, 
for simple integrands and moderate dimension p, this happens rather quickly, in other words, the 
algorithm has saturated capabilities of the machine memory, but not of the machine speed. In a 
second version labelled "r" , the algorithm is "recirculated" by keeping, at a level limit set by the 
user which is chosen to avoid exceeding machine memory capabilities, all the sub intervals that do 
not obey the thinning condition. These are then treated one at a time by the same algorithm, 
up to a second level limit again set by the user. This can take hours or days for high accuracy, 
high p computations, with a practical limit set by the speed capabilities of the machine. Finally, 
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a third version labelled "m" takes the "recirculating" algorithm and parallelizes it using the MPI 
(message passing interface) protocol, by distributing to each process of a cluster a large number 
of the subintervals that do not obey the thinning condition , each of which is then processed by 
the algorithm sequentially. This speeds up the computation by a factor of the number of processes 
available. All routines are coded in double precision, but since the ninth order integration formulas 
involve large numbers in computing coefficients, double precision computation is not enough to 
give double precision accuracy results, so for the fifth, seventh, and ninth order routines in both 
the simplex and direct hypercube cases, we also give a quadruple precision (real(16)) version of 
the programs. 

The programs present the user with various options. By an appropriate choice of ithinlev, 
thinning can be delayed, or even suppressed entirely so that all subdivisions take place to the 
specified subdivision limits. This can give a check that subregions with large contributions, but 
accidentally small error estimates, have not been harvested prematurely, and when the programs 
are modified, gives a useful check that the tiling condition of Eq. (|162p is obeyed. By a choice of 
ithinfun, the user can choose which of three preset thinning functions to use, or by modifying the 
subroutine containing these functions, the user can make another choice of thinning function. For 
simplex integration, the user can choose whether to use the recursive or the symmetric subdivision 
algorithm. The user can choose the accuracy of the integration method used: first through fourth, 
fifth, seventh, or ninth for simplex based routines, and first, third, fifth, seventh, and ninth for 
the direct hypercube routines. Finally, the user can modify the free parameters in the integration 
routines, so as to get different samplings of the integrand, which can give a useful assessment of 
whether the error estimates from the initially used sampling are realistic. 



XII. TEST INTEGRALS; FALSE POSITIVES AND THEIR AVOIDANCE 

For verifying the higher order integration programs, and for checking the operation of the 
adaptive programs, it is essential to have test integrals with known answers. For the standard 
simplex (c.f. Eqs. (130p and (I3ip ). a useful formula is the multinomial beta function integral, 



d Xl ...dx p (l- Xl -x 2 - ... - s p ) Q °- 1 s? 1 - 1 ... a ff'- 1 = P^ r(aa) , (168) 

standard simplex \Z-^a=0 ^ct) 

with r the usual gamma function (see the Wikipedia article on Dirichlet distributions). When 
a a — 1 = v a , a = 0, ...,p with v a an integer, this can be rewritten as 

/ d Xl ...dx p (1 - X! - x 2 - ... - x P Y°^...x v p p = Val . (169) 

J standard simplex \P ' 2—/a=0 ' 
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The uq = case of this formula is the formula given by Stroud (1971) (see also Grundmann and 
Moller (1978)) for the integral of a general monomial over the standard simplex. 
For a unit hypercube, the corresponding formula is 



f 1 f 1 v v P 1 

/ dxx... / dxpX^ 1 ...Xp v = TT — , 

Jo Jo i=i Ue + 



while for a half-side 1 hypercube the corresponding monomial integrals are (c.f. Eqs. ()118p and 



Testing the simplex programs with the integral of Eq. (1169[) . and starting thinning at level 1, 
shows that when the order of the monomial is less than or equal to the order of the integration 
formula used, the iteration terminates at the initial level, and the difference between I a and lb is of 
order the computer truncation error. When a monomial is integrated that is of higher order than 
the integration formula used, with a small enough error measure e, the adaptive program starts to 
subdivide the base region. 

However, a more complicated pattern is seen for the hypercube integrals when evaluated by the 
direct hypercube algorithms, and this brings us to the issue of false positives. As in the simplex 
case, when thinning is started at level 1 and the order of the test monomial is less than or equal 
to the order of the integration formula used, the iteration terminates again at the initial level, and 
I a — It, is of order the truncation error. However, when a monomial is integrated that is of higher 
order than the integration formula used, the adaptive program does not always start to iterate. For 
example, using the fifth order hypercube formula in dimension p = 4, the program iterates for the 
integrand x\, but not for the integrand xfx^x^. The reason is that the latter function, although of 
higher order than that of the integration formula, vanishes on the hyperplanes spanning the axes 
where the fifth order integration formula samples the integrand, and so the I a and lb evaluations 
give the same answer (zero), and the thinning condition is obeyed for arbitrarily small e. This is an 
example of a false positive, in which the thinning condition is obeyed even though the actual error 
is large. Any sampling program for evaluating integrals is subject to false positives for functions 
that take special values (in our case zero, or a constant) on the sampling points. Since the sampling 
points in the simplex integration formulas are on oblique, rather than axis-parallel, lines or planes, 
this problem is not so readily seen with the multinomial test functions of Eq. (I169p . but we 
have nonetheless found examples of false positives. For example, using fifth order integration and 
symmetric subdivision, the p = 5 monomial x(l)a;(2)x(3)x(4) 2 x(5), when computed with thinning 




for all ui even, and zero otherwise 



(171) 
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starting at any level below 3, develops a false positive at level 2 and gives an answer that is wrong 
in the fourth decimal place, even though the output error measures suggest much higher accuracy. 

There are several general ways to guard against false positives. The simplest is to use the freedom 
of choosing the parameter ithinlev to delay thinning until several subdivisions have taken place. 
False positives are most dangerous if they occur in the initial few levels, since these have the largest 
subregions, and if a subregion is prematurely harvested, there is a possibility of significant error. On 
the other hand, thinning becomes most important after several subdivisions have taken place, when 
the number of subregions is large. So there can be a useful tradeoff between starting thinning early 
and starting it late. If computer time permits, one can always do an a posteriori check by choosing 
ithinlev greater than the limit on the number of levels, which suppresses thinning altogether, and 
gives the approximate Riemann sum corresponding to the level of subdivision attained. 

A second general way to guard against false positives is to compute the integral using alternative 
options, for example, using integration programs of several different orders, or where allowed as an 
option for simplex integrals, to use recursive instead of symmetric subdivision. In the fifth order 
p = 5 example noted above, changing to seventh order integration, or changing from symmetric to 
recursive subdivision while maintaining fifth order integration, both eliminate the false positive at 
level two. 

A third way is to add a function with known integral to the integrand, which has significantly 
different local behavior, and to subtract its known integral from the total at the end. For example, 
in the hypercube case, consider the integral 

= y\(z), Mx ) = ^-±-^ - -±- , (172) 

which exists for any q > 1. Adding a multiple of 

v 

Y[M*e) ( 173 ) 

to the test monomial integrands does not change the expected answer, but forces the adaptive 
program to start to subdivide at level 1 (for small enough e) in all monomial cases. It is of 
course not necessary for the added function to have an integral that can be evaluated in closed 
form. In the p = 5 simplex case discussed above, we eliminated the false positive at level 2 by 
numerically integrating the function (l + x(l)) _1 , and then adding a multiple of this function to the 
integrand and subtracting its integral from the answer. When adding such an auxiliary function, it 
is probably a good idea to rescale it so that its order of magnitude is similar to that of the integral 
being evaluated. Clearly there is an infinite variety of such auxiliary functions that can be added 



62 



to the integrand, each of which shifts the false positive problem to a different part of integrand 
function space. Even when one is dealing with generic integrands, in which the program starts to 
subdivide as expected, adding such functions will alter the pattern of subdivision, and can be used 
(in addition to changing the integration formula parameters) to give further estimates of the errors 
in the output values l a j> provided by the integration algorithm. 

We do not recommend just changing the integration formula parameters as a way of eliminating 
false positives. The reason is that the samplings in both the simplex and hypercube cases take 
place on hyperplanes that are determined by the general structure of the integration formulas, 
but do not vary as the parameters in the integration formulas are changed. So if a false positive 
is associated with a zero or constant integrand value on one of these hyperplanes, it will not be 
eliminated by changing the parameter values. Similar remarks apply to changing the thinning 
function as a way of eliminating false positives. 

For related reasons we have not written into the programs another way of creating a criterion 
for thinning, the comparison of results from integration programs of different orders (say, of fifth 
and seventh order). In the simplex example discussed above, doing this would eliminate the 
false positive, since the seventh order routine uses sampling points that avoid the problematic 
hyperplanes sampled by the fifth order routine. However, in this case one may as well do two 
seventh order samplings to set up the thinning condition , and thus benefit from the higher accuracy 
accruing from use of the seventh order routine for smooth integrands. 

XIII. DESCRIPTION OF PROGRAMS IN THE SEVEN DIRECTORIES 

A. General description 

The Fortran programs are grouped into 7 directories, named simplexl23, simplex4, simplex579, 
simplex579_16, cubel3, cube579, and cube579_16. All programs are valid for arbitrary dimension 
V > 1 

The simplex programs all perform adaptive integration over a standard simplex or a Kuhn 
simplex with one vertex at the origin, using real(8) precision (except for simplex579_16, which 
uses rea 1(16)). The programs in simplexl23 perform first through third order integration, the 
programs in simplex4 perform fourth order integration, and the programs in simplex 579 perform 
fifth, seventh, or ninth order integration. 

The same adaptive program treats both the standard and Kuhn simplex cases, with a subroutine 
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argument "Linit" determining which initialization is used. Included in all the simplex packages 
are programs for integration over a side 1 hypercube with one vertex at the origin, by tiling with 
Kuhn simplexes followed by adaptive simplex integration. 

The programs in simplex4 perform fourth order adaptive integration using a different subdivision 
strategy from that used in all the other cases. In simplex4 the simplex vertices are used as sampling 
points, with the side midpoints giving the vertices at the next level of subdivision. In all the 
other programs, only interior points of the simplex are used for sampling. Hence, the simplex4 
programs cannot be used to integrate functions which have integrable singularities at the base 
simplex boundary, whereas the other programs can be used in this case. 

The programs in cubel3 perform first or third order adaptive integration, and those in cube579 
perform fifth, seventh, or ninth order adaptive integration, over half-side 1 hypercubes centered on 
the origin, with real(8) precision. These programs use less memory (by roughly a factor 1/p) than 
the hypercube tiling programs. The cube programs are valid for arbitrary dimension p > 1. 

The programs in simplex579_16 are real(16) re- writings of those in simplex579, and the programs 
in cube579_16 are real(16) re-writings of those in cube579. The real(16) versions are obtained from 
the corresponding real(8) programs by making the following global substitutions: (1) Replace "dO" 
by "qO", (2) replace "implicit real(8)" by "implicit real(16)", (3) replace "dabs" by "qabs", (4) 
replace "d20.13" by "d32.36". These changes can be made using a "replace all" utility, since the 
strings that have to be modified do not occur anywhere else in the programs. Note that explicit 
data type declarations that override the implicit ones are not changed. 

Each directory contains a package of subprograms, labeled respectively simplexsubsl23.for, sim- 
plexsubs4.for, simplexsubs579.for, simplexsubs579_16.for, cubesubsl23.for, cubesubs579.for, and 
cubesubs579_16.for. The subroutines in these packages do not have to be accessed by the user 
in normal operation of the adaptive programs. If they are accessed to alter the programs, we 
strongly recommend doing several test integrals before and after the changes, to make sure they 
still operate correctly. Each directory also contains a series of main program files, and each main 
program file contains the main program proper, as well as a subroutine setting up the function to 
be integrated, subroutines setting up the free parameters used in the parameterized integrations, 
a subroutine setting up three options for the thinning function, and in the case of the Kuhn tiling 
treatment of hypercubes, a subroutine symmetrizing the function to be integrated over all its vari- 
ables. Each program that requires user setting of input parameters contains comment statements 
giving instructions. To run the programs, the user must compile and link the subroutine package 
in a directory with the appropriate main program file in the same directory. 
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As noted in the section on Vandermonde solvers, all programs are self-contained, since their 
subroutine packages include Vandermonde solvers that compute the explicit solution of the Vander- 
monde system for the relevant values of N. Because the ninth order simplex integration routines 
and associated Vandermonde equations involve large numbers in computing coefficients, use of 
real(16) is recommended if one wants to get answers with real(8) accuracy. Solving the Vander- 
monde equations to get the coefficient parameters for the integrations need be done only once before 
adaptive integration begins; this is done in the subroutines with names beginning with "ext" , the 
output of which is then fed to the integration programs that are used repeatedly in the adaptive 
integration process. 

There are three generic types of main programs in each directory. Those with names not ending 
in "r" or "m" execute adaptive integration to a subdivision level set by the user (and limited by 
machine memory). Those with names ending in "r" execute the "recirculating" routines, in which 
after the first stage of subdivision, the remaining subregions are subdivided sequentially in a second 
stage to a second level of subdivision set by the user. Those with names ending in "m" execute an 
MPI parallel version of the "recirculating" routines, in which after the first stage of subdivision, 
the remaining subregions are farmed out to the available processes for a second stage of subdivision 
to the second level of subdivision set by the user. 

In order to conserve memory, the labeling of simplex and cube points and the simplex subdivision 
routines use a lattice built on integer(2) arithmetic. This allows 14 levels of subdivision in the initial 
stage, since 2 14 =16384, which is half the maximum integer representable in integer(2). In order 
to go beyond 14 levels of subdivision in one stage, say to 30 levels of subdivision, one would have 
to replace 16384 in the subroutines by 2 30 = 1,073,741,824, which is half the maximum integer 
representable in integer (4), replace all integer (2) data type declarations by integer (4), and enlarge 
the level number limits in the programs. The explicit limits in the programs on the number of levels 
correspond to the requirement that the minimum integer(2) lattice spacing must not be smaller 
than 1, since in integer arithmetic 1/2 is replaced by 0. Program stages that pass on subdivided 
regions have a limit of 14 levels, while output stages that do not pass on subdivided regions have a 
limit of 15 levels. An exception to this rule is in the simplex4 programs, where there is an explicit 
division by 2 in the programs, and so the corresponding limits are 13 and 14. Note that in integer(2) 
arithmetic, 16384/2 + 16382/2 = 16384 / (16384 + 16384)/2 = (-32768)/2 = -16384, which is 
why in the simplex4 integration program we have not regrouped added terms into parentheses. 

The recirculating and MPI programs make use of the observation that symmetric (or recursive) 
subdivision of standard simplexes, symmetric and recursive subdivision of Kuhn simplexes, and 
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hypercube subdivision, all give after £ subdivisions a subregion that fits within a hypercube of side 
l/2 e (or 1/2 £_1 ). This observation, which is an unproved conjecture supported by our numerical 
results in the case of standard simplexes, permits a doubling of the number of levels attainable 
within integer(2) arithmetic in the "r" and "m" programs, as follows. At the start of the second 
stage of subdivision, each subregion is translated by a shift vector and is rescaled by a factor which 
expands it to just fit within the initial lattice containing base region. This permits another 15 (or 
for recursive subdivision, 14) levels in the second stage (with corresponding limits in the simplex4 
programs of 14 (or 13)), and so the "r" and "m" programs can subdivide to subregions that have 
a dimension 2~ 28 = 3.725 x 1CT 9 of the base region dimension. Whether this can be attained 
in practice for a given dimension p of course depends on available machine memory. Subdivision 
limits appropriate to the various cases have been incorporated into the main programs. 

Because simplex points are represented in integer(2) arithmetic, in order to apply the simplex 
subroutines to a starting simplex that does not have only Os or Is in the vertex coordinates (for 
example, an equilateral triangle), one would have to change the integer (2) data type declarations 
to real(4) for the programs to work correctly. This change increases the memory requirements, and 
should not be made unless needed. We note also that with the aim of conserving memory, we have 
used allocatable memory to store subregion information, allocating memory where needed at each 
level of subdivision, and deallocating memory when no longer used. 

Finally, we note that the MPI programs are written using only simple MPLSend and MPLRecv 
commands. All processes simultaneously carry out the first stage of subdivision, and then each 
process of rank greater than takes its share of the remaining subregions after the first stage and 
processes them further. This wastes some processor time, but avoids large data transfers. Only at 
the end, when all processes of rank greater than have finished, is their output combined in process 
0. Because MPI can only pass real(8) numbers as messages, the real(16) MPI programs give only 
real(8) output. (This is one of the reasons why the explicit real(8) declarations are not modified 
in the conversion substitutions leading to real(16) programs.) Nevertheless, the MPI programs 
compute the sensitive parts of the high order integrations in real(16), converting to real(8) only at 
the end when process outputs are combined. 

To enhance readability of the programs, we have used indents to show the different levels of 
"if" chains, except in one place in the MPI programs, where we have given the "if", "else if", and 
"end if" lines statement numbers 97,98,99. We have not indented the contents of "do" loops, since 
these always begin and end with a statement number, and never with an unnumbered "enddo". 
(The one exception to this is in the subroutine BestLex used for the symmetrization step in the 
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Kuhn tiling programs for integration over hypercubes, which has been taken verbatim from H. D. 
Knoble's (1995) website.) This is of course a matter of taste; our feeling was that indenting both 
the "if" chains and "do" loops would result in so many levels of indents that readability of the 
programs would be decreased. We also remind the reader that the direct hypercube programs were 
written by minimal modification of the simplex programs, changing array arguments where needed 
(e.g., "ip+1" for simplexes becomes "2*ip" for hypercubes), but not changing array names. So the 
array names in the direct hypercube subroutines are not the ones that would naturally be chosen 
if these programs were written without reference to the simplex case. 

B. Inputs 

The main programs require the following inputs to be set by the user: 

1. ithinlev tells the program when to begin thinning subregions, by harvesting those that obey 
the thinning condition of Eq. (|160p . Thinning begins when the total level number exceeds 
ithin. Thus, with ithin = 0, thinning begins at level 1, while if ithin is greater than or equal 
to the maximum total level number, there is no thinning. 

2. ithinfun tells the program which thinning function option to use. As explained in Sec. X, 
ithin fun = 1 corresponds to a thinning function f(A,B) = \A — B\, ithinfun = 2 to 
f(A,B) = \A- B\/\A + B\, and ithinfun = 3 to f{A,B) = (A - B) 2 . 

3. isubdivision tells the simplex programs whether to use symmetric subdivision (isubdivision = 
1) or recursive subdivision (isubdivision = 2). This parameter does not appear in the 
main programs in cubel3, cube579, and cube579_16, where there is no choice of subdivision 
methods. 

4. iaccuracy tells the programs to use the integration program of order iaccuracy. For example, 
in the simplexl23 programs, to select third order accuracy one sets iaccuracy = 3, and in 
the simplex 579 programs, to select seventh order integration one sets iaccuracy = 7. This 
parameter does not appear in the main programs in simplex4, which uses only fourth order 
integration. 

5. ip gives the spatial dimension p of the simplex or hypercube being integrated over, and can 
take any integer value > 1. Thus, to integrate over a three dimensional cube one would set 
ip = 3. 
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6. eps sets the parameter e appearing in the thinning condition of Eq. (|160|) . For ithinfun=l, 
this gives an absolute error criterion; to achieve a given level of relative error, one needs 
a rough estimate of the value of the integral as given by outa or outb, which can be used 
to readjust eps by use of Eq. (|167p . For nonsingular integrands, the eps value when using 
ithinfun=3 should, as a first guess, be taken as the square of the eps value that one used for 
ithinfun=l. Note that if ithinlev is greater than the total level number, so that thinning is 
suppressed, the results are independent of the value given to eps. 

7. In all programs other than the hypercube tiling program, the external function is supplied by 
the user in the subroutine fen. for. In the tiling programs, fen. for is instead the symmetriza- 
tion program for the external function supplied by the user in the subroutine fcnl.for. 

8. Mm sets the limit to the number of subdivisions in the programs with names not ending 
in "r" or "m". It can be any integer between 1 and 15, except in the simplex4 programs, 
where the range is 1 to 14. In practice, the effective upper limit is set by machine memory. 
Start with a low value of Mm, and then to improve the accuracy, increase it until you get 
a diagnostic saying memory has been exceeded; the value of Mm one less than this is the 
maximum value Mm = LMAX that does not exceed memory. Since the final level I = Mm 
does not further subdivide, this limit is associated with the number of subregions carried 
forward from level I — 1 to the final level. As Mm is increased the execution time will increase, 
and this will also impose an effective upper limit. 

9. Mml and Mm2 in the programs with names ending in "r" and "m" set the limit to the number 
of subdivisions in the first and second stages of subdivision, respectively. The maximum 
value of Mml is 14 and of Mm2 is 15, except in the simplex4 programs, where the respective 
limits are 13 and 14, and also except for recursive subdivision, where the maximum value 
of Mm2 is one less than the corresponding value for symmetric subdivision. In all cases, 
the built-in subdivision limits prevent the program from dividing 1 by 2, giving an integer 
arithmetic answer of 0. As before, the effective upper limit will be set by machine memory 
and machine execution speed. In using the "r" and "m" programs, and setting Mm2 = 1, the 
maximum value of Mml that will not exceed memory is Mml = LMAX—1, with LMAX the 
corresponding maximum determined as above for the single stage program. Once LMAX 
is determined, one can take any value 1 < Mml < LMAX without exceeding memory. 
Because of the staging, the numerical output depends only on the sum Mml + Mm2, that 
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is, one is free to redistribute the computational effort between the first and second stages. 

10. The parameters for the higher order integration routines are contained in the main program 
files in subprograms with names beginning with "setparam" . They are given in array con- 
structors, and have been preset to values indicated. The program variable names have been 
chosen to roughly correspond to the symbol names in the formulas of Sees. VII and VIII. 
For example, for the direct cube routines, where A is a free parameter, it is called aalamb; 
in the order 7 routine for simplex integration, A^ and \\ are the respective elements of the 
array constructors alambl (blambl) and alamb2 (blamb2) corresponding to the first (second) 
choice of parameter values. (In the fifth order cube and simplex routines, where only one 
pair of array constructors is needed, they are called alamb (blamb), even though the corre- 
sponding quantity is labeled X\ in Sees. VIIC and IXB.) These presets can be changed by 
the user to give a different sampling of the integrand in the integration subregions, subject to 
the following rules: (i) The inequalities in the comment statements must be obeyed, to keep 
the sampling points inside the subregion, as discussed in Sees. V and VI. (ii) The parame- 
ters in each array constructor must have non-degenerate values, so that the corresponding 
Vandermonde equations will be solvable. If two parameters in an array constructor are very 
close, solution of the Vandermonde system will have large truncation errors, so care should 
be taken to keep the parameters in each array constructor reasonably well spaced, (iii) The 
"a" and "b" array constructors should have different parameter values, since these are used 
to give the two different integrand evaluations used in the error estimate. 



C. Outputs (and their use in making memory and running time estimates) 

Program outputs (except in the MPI case) are written to a file "outdat.txt" and also appear 
on the screen. In the MPI case, outputs are written to the output file specified by the system for 
a "print" statement. A brief description of output labeling follows: 

1. All programs write out the user-set values of ip, Mm (or lliml and Uim2), eps, ithinlev, 
ithinfun, i subdivision, and iaccuracy. They do not print out the values of the parameters 
in the array constructors in the subprograms setparam. 

2. In all programs, outa and outb give two evaluations of the integral divided by the base 
region volume, corresponding respectively to the two different samplings of the integrand set 
by the "a" and "b" parameters in the array constructors, and \outdiff\ gives the difference 
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\outa — outb\. The size of \outdiff\ gives an estimate of the likely error in the answer; this 
estimate can be improved by evaluating the integral with a number of different choices of the 
array constructor parameters, and also by comparing the evaluations obtained using different 
program options as set by the user-set inputs. As noted in Sec. XI, to get the value of the 
integral without normalization by the base region volume, one must multiply outa and outb 
by the base region volume Vq. For a standard simplex, Vo = for a side 1 hypercube, 

Vq = 1, while for a half-side 1 hypercube, Vq = 2 P . 

3. In all programs, err sum gives the sum of the absolute values of the local subinterval thinning 

tests, 

errsum = y(subregion)|/(/ a (subregion, If, (subregion)) | , (174) 

subregions 

with f(A, B) the thinning function. As explained above, for the choice ithinfun = 1 this 
gives an upper bound for \outdiff\, and when I a (subregion — 1^ (subregion) has uniform sign 
over all subregions, errsum = \ outdiff \ . However, when signs are not uniform over sub- 
regions, errsum for ithinfun=l can be much larger than the actual error, as in the two 
Gaussian example discussed below. 

4. In all programs, I gives the level number, ind gives the number of subregions carried forward 
to the next level, indmax gives the maximum value of ind encountered over the course 
of the various levels that have been executed, fcncalls gives the number of function calls, 
t-current gives the current elapsed time in seconds at the various levels of the first stage, 
and t- final gives the total elapsed execution time in seconds. In the approximation in 
which the geometric series summing the number of function calls over the various levels is 
approximated by its largest term, corresponding to the highest level attained, and when 
there is no thinning, fcncalls ~ y2P(M»"*-i) for the single stage program, and fcncalls ~ 
rp2p(ihmi+ihm2-i) f or " r " anc [ " m " programs, with T the appropriate function call value 
from Table VI or VIII. (in the absence of thinning, the exact formula summing the geometric 
series is fcncalls = T[2 p( - K+ V - l]/[2 p - 1], with K = llim - 1 for the single stage program 
and K = lliml + llim2 — 1 for the "r" and "m" programs.) When there is thinning, this 
gives an upper bound on the number of function calls. 

5. In the "recirculating" programs with main program name ending in "r", t_restart gives the 
time at which the second stage is initiated, in which the subregions carried forward from the 
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first stage are subdivided sequentially. During the second stage, the program will indicate 
approximately when it is 10, 20, 90, 100 percent finished in sequentially processing the 
subregions carried forward from the first stage, by printing this information to the screen 
(but not by writing it to file). In interactive mode, this permits one to gauge how long the 
calculation will take to finish; if it looks like the calculation will take longer than one wishes 
to wait, one can stop execution and restart with different, more tractable, parameter values. 
Since these numbers are computed by integer division, the actual numbers may be 9,19,... or 
other similar strings, depending on the residue modulo ten of the number of regions carried 
forward. One can also estimate the total running time by multiplying t_ restart by the 
number of subregions ind carried forward to the second stage from the final level of the first 
stage, further multiplied by 2 p ( lhm2 ~ lhm1 ^ to correct for a difference in the first and second 
stage level numbers. (Similarly, for the single stage programs, from t-current and ind at 
the output of any level I, the maximum running time, in the absence of thinning, to reach 
the level limit lliml is the product i_ current times ind, further multiplied by e p ( lliml ~ l \)We 
generally found that timing values, on a laptop, varied by one or two tenths of a second 
between identical runs, so estimates of total running time become reliable only when one 
has proceeded to the point where several seconds have elapsed. The final statistics include 
indcount, which gives a sum of the ind values at each level of the second stage, and which 
indicates the ind value that would be needed if the second stage subdivisions were carried 
out in the first stage by using a larger lliml value. Because of the staging strategy, the 
maximum ind value that is required is the much smaller number indmax. 

6. In the MPI programs with main program name ending in "m", i_ restart is the time at the 
end of the first stage when the subregions carried forward, numbering indstart in total, are 
distributed to multiple processes, and fcncalls gives the number of function calls up to this 
point. If t-restart does not appear in the output, the program has completed execution 
before entering the second stage. Since MPI programs are typically run in batch mode, no 
intermediate statistics are output during the second stage, but one can make a rough estimate 
of total second stage running time by multiplying t_restart by the number of subregions 
indstart carried forward, further multiplied by 2 p ^ lhm2 ~ lhm1 ^ to correct for a difference in 
the first and second stage level numbers, and dividing by N pioccss — 1 (process serves only 
as an accumulation register for the output of the remaining N proccss — 1 processes). If this 
estimate is too large, one can stop execution and restart with less ambitious parameters. 



71 



The final statistics include indmaxprocess, which is the maximum of the final indmax over 
all of the processes. 

XIV. SOME SAMPLE RESULTS, AND OPEN QUESTIONS 

A. Sample results 

We turn now to some sample results which illustrate the capabilities of our numerical integration 
programs. Our first example is one given in the paper on VEGAS of Lepage (1978), consisting 
of the sum of two spherically symmetric Gaussians equally spaced along the diagonal of a cubical 
integration volume, 

Ip = -( -4T I" dP X [e-ZU(*(i)-mVa 2 +e -E?. 1 Wi)-2/3)Va 2 ] > (175) 

2 \mrV 2 J Jo 

with a = 0.1. In this form I p can be evaluated by the "cubetile" programs which tile a unit 
hypercube with Kuhn simplexes. In order to apply the direct hypercube "cube" programs, we 
make the change of variable x = (1 + y)/2 to rewrite I p as an integral over a half-side 1 hypercube, 

I p = --( * Y t ^[ e -£Li(^)+i/3)7(4a 2 ) + e-SUWH/s) 4 /^)] . (176) 
' 2 2P yair 1 ' 2 ) J_\ 

In his paper Lepage compares numerical evaluations of I p for various p with a target value of 
unity, which is accurate enough for his purposes. However, the programs given here are capable 
of much higher accuracy with current computers, so we will need a high accuracy evaluation of I p 
for comparison purposes. We can get this by noting that the two Gaussians contribute equally to 
I p (to see this, set x — > —x in Eq. ()176p ) , and each individual Gaussian is the pth power of a one 
dimensional integral J, giving 

T — TP 

J ^ dye-^W 2 /^ . 

2a7rV2 J_ 1 

(177) 

The one-dimensional integral J can be evaluated in terms of error functions or complementary 
error functions, 

J=i[erf(l/(3a))+erf(2/(3o))] , 
=l-I[erfc(l/(3a))+erfc(2/(3a))] , 

(178) 
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TABLE X: Evaluation of J to 13 place accuracy using 3rd, 5th, 7th order cube routines 

iaccuracy \outdiff\ fcncalls t_ final 
3 1CT 16 0.2 x 10 5 < .02s 

5 10~ 15 0.5 x 10 5 < .02s 

7 10~ 13 0.9 x 10 5 < .02s 

TABLE XI: Evaluation of powers of J to give expected values of I p to 13 place accuracy 

P I P = J p 

1 0.9999987857663 

2 0.9999975715341 

3 0.9999963573033 

4 0.9999951430740 

5 0.9999939288462 

6 0.9999927146199 

7 0.9999915003951 

8 0.9999902861717 

9 0.9999890719498 

but it can also be evaluated numerically to 13 digit accuracy by running the "cube" program to 
a depth of twelve total levels. Running the "r" version of the programs with parameter values 
ip = 1, lliml = 5, lliml = 7, ithinlev = 12 (no thinning, which makes the results independent of 
eps and ithinfun), and iaccuracy = 3, 5, 7, we get from all three runs the result 

J = 0.9999987857663 . (179) 

The statistics for running time (on a MacBook Pro) and the number of function calls for these 
runs are given in Table X. Running with iaccuracy = 1 gave only 10 place accuracy with 12 
levels, but gave 13 place accuracy when 20 levels were used (which took about a second, rather 
than hundredths of a second). Thus, for this calculation iaccuracy = 3 is the most cost-effective 
program. 

Running a program to raise J to powers then gives the expected results for I p given in Table 
XI, with an uncertainty of 1 in the final decimal place. 

We give in Table XII results for dimensions p = 2, 3, 4, and 5 as obtained from the "r" version 
of the programs on a laptop, and in Table XIII for p = 7, 9 as obtained by running the "m" version 
on a 64 process cluster. (Laptop runs were done on a MacBook Pro and an older Dell Inspiron, and 
for the latter, the timings were rescaled by a factor 0.49 to give timings for a MacBook Pro. We 
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made cluster runs with 128 or 64 processes, and for the former, we rescaled the running time to that 
for 64 processes. We invite the reader to compare the running times and accuracies summarized 
in Tables XII and XIII with those that can be obtained from other integration programs.) For 
all of these runs, where thinning was used, we took ithinfun = 1. "Place accuracy" indicates 
the decimal place where differences first appear from the 13 place result in Table XI. To within 
an order of magnitude, this agrees with the difference |outdiff | between the two evaluations of the 
integral given by the program. The values of errsum for these integrals (not shown) were typically 
one to three orders of magnitude larger than both joutdiff | and the actual error, indicating that 
the local subinterval errors do not all have the same sign. Since |outdiff| can be smaller than the 
actual error, for an unknown integral it cannot be taken as giving the error; in this case the best 
way to estimate the error is to run the program with different choices of program options and to 
use the spread of results to estimate the error. 

From Tables XII and XIII, we see that a minimum of 5 levels is needed to get good accuracy for 
the double Gaussian example. With 5 levels, the smallest hypercube side is 1/32 = 0.03125, small 
enough to resolve the double Gaussian characteristic scale of 0.1 in good detail. On the other hand, 
with only 4 levels, the minimum side is 1/16 = 0.0625, making it harder to resolve a scale of 0.1 and 
limiting the accuracy to 4 significant figures. Most of the cluster runs were done without thinning, 
and thus should characterize the accuracy attainable for any function on a unit hypercube with 
a characteristic scale length of 0.1. For serial runs done with thinning (Table XII), the reduction 
in running time was proportional to the reduction in number of function calls, and ranged from a 
saving in the range 30% to a factor of 3.5, for values of e which yield the same or one place less 
accuracy as when there is no thinning. For parallel cluster runs done with thinning (Table XIII), 
the reduction in running time is considerably less than the reduction in number of function calls. 
This arises from the fact that even though the program initially distributes subregions to processes 
using a shuffling routine that assigns adjacent subregions in the stack to different processes, some 
processes get subregions (like ones near the Gaussian peak) that are "hard" and so take longer to 
finish, as compared with processes that get "easy" , readily thinned regions near the Gaussian tails. 
Consequently, since the final time t- final records the time when all processes have finished, it is 
not reduced by thinning in proportion to the number of function calls. 

From Tables XII and XIII, we see that thinning with a value of e equal to the error level in the 
runs with no thinning leads, in the double Gaussian examples, to a reduction in accuracy. This 
reflects the fact that in the double Gaussian case, errsum is typically 2 to 3 orders of magnitude 
larger than |outdiff|, indicating that the local errors are not of constant sign, and also errsum is 2 
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to 3 orders of magnitude larger than e, indicating that the local thinning condition is not satisfied 
in all subregions. When local subregion errors alternate in sign and the thinning condition is not 
uniformly obeyed, there can be cancelations of errors in the output integrals, leading to improved 
accuracy and a smaller |outdiff|, but thinning can then reduce the degree of cancelation and 
reduce the accuracy. Finally, we note that the 6 level run with iaccuracy = 9 did not give as 
many significant figures as the corresponding run with iaccuracy = 7; we believe this is due to 
the increased truncation errors associated with running the ninth order routine. The cluster which 
we used was more than an order of magnitude slower in running quadruple precision (real(16)) as 
opposed to double precision (real(8)) code, so it was not feasible for us to investigate this further 
by repeating the ninth order 6 level run in quadruple precision. (We did, however, test in quadruple 
precision that the ninth order routines integrate polynomials of ninth degree or lower to within 
expected truncation errors.) 

We also studied the double Gaussian example using the "cubetile" programs. These integrate 
over a unit hypercube by integrating, over a single Kuhn simplex, the symmetrized function that 
sums over corresponding points of a Kuhn tiling of the hypercube. Results for this study are 
given in Table XIV. Because of the p ! symmetrization factor in the number of function calls, the 
"cubetile" programs take longer to run than the "cube" programs for a corresponding number 
of subdivision levels. Because tiling of a hypercube with Kuhn simplexes does not reduce the 
subregion side length, the p ! increase in number of subregions does not compensate for insufficient 
resolution when the attainable level number is not large enough. This can be seen from the results 
in Table XIV. For p = 5, where 5 levels can be run on the cluster in reasonable time, significantly 
better results are obtained from a 5 level "cubetile" run than are obtained from a 5 level "cube" 
run, at the price of a factor of 120 more function calls. However, for p = 7 it was not possible to 
do a 5 level calculation in reasonable cluster running time, so we had to settle for 4 levels, which 
as we saw above has insufficient resolution to give very high accuracy for the double Gaussian test 
problem. The "cubetile" results in this case are better than the 4 level "cube" results, reflecting 
the factor of 5040 more function calls, but the 6 place accuracy achieved is not as good as what 
can be achieved, in less running time, by using the "cube" program with 5 levels. We conclude 
that the "cubetile" programs become significantly less useful as the dimension p increases, because 
of the p ! symmetrization factor necessitated by Kuhn tiling. 

Our next two examples illustrate results obtained from the "simplex" programs to evaluate 
integrals over a standard simplex. For our first example, we consider an integral based on the 
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TABLE XII: Double Gaussian results using the "cube" program for ip = p =2 ,3 ,4 ,5 ,7 (timings for 
MacBook Pro); levels = lliml + Uim2 and "place accuracy" compares to Table XI 



ip {accuracy 


levels 


ithinlev 


eps 


t- final 


fcncalls 


(outa + outb) /2 


\outdiff\ 


place accuracy 


2 


3 


10 


no thinning 






1.2s 


0.31 x 10 7 


0.9999975715340 0.4 x 10~ 12 


13 


2 


5 


10 
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3.4s 


0.94 x 10 7 


0.9999975715339 0.3 x 10~ 14 
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5 
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2 
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13 
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0.9999975715340 
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2 
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10 
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2 
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13 


3 


3 


7 
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0.999996358 
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9 


3 


3 


9 


2 


10^ 


13 


72s 


0.18 x 10 9 


0.999996357305 


0.9 x IO" 11 


12 


3 


5 


7 


2 


10" 


-9 


2.8s 
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0.999996356 


0.3 x 10- 10 


9 


3 


5 


7 


2 


10~ 


13 


4.1s 


0.11 x 10 8 


0.99999635730 


0.2 x 10- 10 


11 


3 


5 


9 


2 


10~ 


13 


190s 


0.48 x 10 9 


0.9999963573032 0.3 x 10~ 13 


13 


3 


7 


7 


2 


lo- 


12 


13s 


0.32 x 10 8 


0.9999963573033 0.5 x 10~ 12 


13 


4 


3 


7 


2 


ur 


-9 


55s 


0.12 x 10 9 


0.999995143 


0.3 x IO" 8 


9 


4 


5 


7 


2 


io- 


13 


310s 


0.71 x 10 9 


0.99999514305 


0.4 x 10- 10 


11 


4 


7 


5 


2 


10" 


-7 


3.7s 


0.83 x 10 7 


0.99999510 


0.3 x IO" 8 


8 


4 


7 


6 


2 
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13 


90s 


0.20 x 10 9 


0.99999514308 


0.2 x 10- 10 


11 


5 


3 


5 


2 
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13 


6.4s 


0.14 x 10 8 


0.9999940 


0.7 x IO" 6 


7 


5 


3 


G 


2 
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13 


170s 


0.37 x 10 9 


0.99999394 


0.5 x 10- 7 


8 


5 


5 


5 


no thinning 






49s 


0.10 x 10 9 


0.99999386 


0.2 x IO" 6 


7 


■5 


■5 


■5 


2 
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■13 


29s 
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0.99999386 
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7 


■5 


7 





no thinning 






9/1 He 
Z4US 


U.OU X 1U 


0.99999393 


0.1 x IO" 7 


Q 
O 


5 


7 


G 


2 


10" 


-9 


1700s 


0.34 x 10 10 


0.999993926 


0.3 x 10- 10 


9 


7 


3 


5 


no thinning 






6100s 


0.78 x 10 10 


0.999992 


0.9 x IO" 6 


6 


7 


3 


5 


2 


10" 


-9 


840s 


0.11 x 10 10 


0.999991 


0.9 x IO" 6 


6 


7 


■5 


5 


2 


10" 


-9 


3500s 


0.62 x 10 10 


0.999991 


0.2 x IO" 6 


6 


7 


7 


4 
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1200s 


0.21 x 10 10 


0.9997 


0.4 x IO" 2 


3 


7 


7 


4 


2 


10" 


-9 


340s 


0.60 x 10 9 


0.9995 


0.4 x IO" 2 


3 



Feynman-Schwinger formula of Eq. (j32[) . with Dq = 1 and D\ = ... = D p = a, 

^P~ Pl /tandard simplex [1 + (a - 1) (x(l) + ... + x( P ))]P+ l ' ^ 

Taking a = 0.1 (and for p = 5, also a = 0.01) gives an integral that is sharply peaked on the 
diagonal hyperplane 1 = x(l) + ... + x{p) bounding the simplex. Results for this integral obtained 
from the "m" version of the program, with symmetric subdivision and no thinning on a 64 process 
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TABLE XIII: Double Gaussian results using the "cube" program for ip = p =7, 9 from a 64 process cluster; 
levels = lliml + Uim2 and "place accuracy" compares to Table XI 
ip iaccuracy levels ithinlev eps t_ final fcncalls (outa + outb) /2 \outdiff\ place accuracy 
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10 
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-6 
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7 


7 


7 


■5 


no 


thinning 
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0.27 x 


10 12 


0.99999150 


0.3 x 
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-7 


8 


7 


1 


G 


no 
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0.52 x 
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0.999993 
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-6 


G 


7 


3 


G 


no 
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0.9999915 


0.8 x 
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-7 


7 


7 


5 


G 


no 
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-8 


8 


7 


7 


G 


no 
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98000s 


0.35 x 


10 14 


0.9999915003 


0.1 x 


10" 


-9 


10 


7 


7 


G 




2 10- 10 


37000s 


0.37 x 


10 13 


0.99999148 


0.1 x 


io- 


-9 


8 


7 


9 


5 


no 


thinning 


4000s 


0.14 x 


10 13 


0.99999144 


0.6 x 


io- 


-7 


8 


7 


9 


G 


no 


thinning 


510000s 


0.18 x 


10 15 


0.99999148 


0.1 x 


10" 


-7 


8 



9 3 5 no thinning 7500s 0.25 x 10 13 0.999989 0.1 x 10~ 5 6 

9 5 5 no thinning 49000s 0.17 x 10 14 0.9999888 0.8 x 10~ 6 7 

9 7 5 no thinning 380000s 0.13 x 10 15 0.99998907 0.7 x 10~ 7 8 

9 7 5 2 10~ 8 100000 s 0.52 x 10 13 0.9999885 0.7 x 10~ 7 7 



TABLE XIV: Double Gaussian results using the "cubetile" program with symmetric subdivision, for ip = 
p =5, 7 from a 64 process cluster; levels = lliml + lliral and "place accuracy" compares to Table XI 



ip iaccuracy 


levels 


ithinlev 


eps t_ final 


fcncalls 


(outa + outb) / '2 | outdiff \ 


place accuracy 


5 7 


5 


no thinning 


73s 


120 x 0.27 x 10 9 


0.999993928884 0.7 x 10" 11 


12 


7 7 


4 


no thinning 


12000s 


5040 x 0.93 x 10 9 


0.999988 0.1 x 10~ 5 


6 



cluster, are given in Table XV. "Place accuracy" indicates the decimal place where differences first 
appear from the exact answer a~ p , and this correlates well with the difference |outdiff| between 
the two evaluations of the integral given by the program. The values of errsum for these integrals 
(not shown) were nearly identical to |outdiff|. 

As our second simplex example, we consider the polynomial integral (c.f. Eq. (|169p ) 

2 p+1 r P 

7 T7 ^ T7 r=p!/ [1 -x(l) - ... - x(p)] 2 T\ x(i) 2 , (181) 

(3p + 2)(3p+l)...(p + 2)(p + l) ^ Atandard simplex 1 ^ t\ 

which is strongly suppressed at all the vertices of the simplex. Running a program to evaluate the 
exact answer for this integral on the left hand side of Eq. (|18ip . and then evaluating the integral 
on the right on a laptop using the "simplex" programs, gives the results in Table XVI. 



77 



TABLE XV: Feynman-Schwinger integral for ip = p = 5, 7, 9 from a 64 process cluster, with no thinning; 
levels = lliml + lliml and "place accuracy" compares to the exact answer a~ p 



ip 


a 


iaccuracy 


levels 


i_ final 


fcncalls 


(outa + outb)/2 


\outdiff\ 


place accuracy 


5 


0.1 


7 


5 


0.39s 


0.27 x 10 9 


0.99998 x 10 5 


0.3 


5 


5 


0.01 


7 


9 


52000s 


0.22 x 10 14 


0.999998 x 10 10 


0.3 x 10 4 


6 


7 


0.1 


7 


5 


160s 


0.12 x 10 12 


0.99995 x 10 7 


0.2 x 10 3 


5 


7 


0.1 


7 


6 


20000s 


0.15 x 10 14 


0.9999995 x 10 7 


2. 


7 


9 


0.1 


7 


5 


56000s 


0.48 x 10 14 


0.9999 x 10 9 


0.3 x 10 5 


4 



TABLE XVI: Polynomial integral for ip = p = 4, 5 (timings for a MacBook Pro); levels = lliml + Uim2 



and "place accuracy" compares to the exact answer on the left hand side of Eq. () 18 1(1 



ip iaccuracy 


levels 


t_ final 


fcncalls 


(outa + outb)/2 


exact answer 




\outdiff\ 


place accuracy 


4 


5 


6 


8.7s 


0.63 x 10 8 


0.88095326 x 10~ 8 


0.8809532619056 x 10 


-8 


0.1 x 10~ 17 


8 


4 


5 


8 


2200s 


0.16 x 10 11 


0.880953261905 x 10~ 8 


0.8809532619056 x 10 


-8 


0.3 x 10~ 21 


12 


5 


7 


5 


39s 


0.27 x 10 9 


0.21591990 x 10~ 10 


0.2159199171337 x 10~ 


■10 


0.2 x 10~ 18 


8 


5 


7 


6 


1300s 


0.86 x 10 10 


0.2159199171 x 10~ 10 


0.2159199171337 x 10~ 


■10 


0.7 x 10~ 21 


10 



As our final example, we consider the 1 dimensional singular integral 

f 1 1 

/ dx = vr/2 ~ 1.57079633 . (182) 

Jo vl-x 2 

In Table XVII we give results for this integral using the "r" version of the "cube" program, with 
p = ip = 1, lliml = 14 and Uim2 = 15 (that is, using the maximum allowed number of levels), 
iaccuracy = 5, ithinlev = (that is, thinning starts at the outset), and eps = 10 -10 , as a function 
of the choice of thinning function ithinfun. We see that in this case, ithinfun = 3 gives the 
fastest evaluation, with ithinfun = 2 next fastest and ithinfun = 1 the slowest. This differs from 
the double Gaussian examples, where ithinfun = 1 gives better results than either ithinfun = 2 
or ithinfun = 3. 



B. Programming extensions and open questions 

There are a number of possible extensions of the programs that could be pursued in the future. 
(1) The MPI version of the programs could be rewritten to include redistribution of the process 
workload after each level £ of the second stage. This would make the reduction in running time 
when using thinning track more closely with the reduction in the number of function calls. (2) The 
multistage strategy could be extended to a third (or more) stages, by not harvesting the subregions 
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TABLE XVII: Evaluation of a one dimensional singular integral with different thinning functions. Running 
times were all less than 0.1s; "place accuracy" compares to the exact answer 



ithinfun 


fcncalls 


[outa + outb)/2 


\outdiff\ 


place accuracy 


1 


0.24 x 10 6 


1.570778 


0.35 x 10~ 5 


5 


2 


0.13 x 10 5 


1.570778 


0.35 x 10~ 5 


5 


3 


0.45 x 10 4 


1.570777 


0.43 x 10~ 5 


5 



that fail to obey the thinning condition at the end of the second stage, but instead writing them 
to a memory device which is then read sequentially by a third stage, etc. (3) One could build 
in an option of permuting the simplex vertices at the start of the simplex programs, which gives 
a different subdivision, and therefore a different evaluation of the integral for use in estimating 
errors. (4) Finally, we remark that the same subdivision, thinning, and staging strategies that we 
have used will apply with any integration formulas that give two different estimates of the answer 
from each subregion, not just the parameterized moment fitting formulas that we developed in 
Sees. VII and IX. 

There are also a number of mathematical questions that we have left open. (1) We found 
numerical evidence that symmetric subdivision of a standard simplex obeys the bound of Eq. (|46p 
for reduction of side length, and that after I symmetric (recursive) subdivisions, the resulting 
subsimplexes each fit within a hypercube of side 1/2^ (l/2 f '~ 1 ). We do not have a proof of these 
conjectures, but have assumed them true in constructing the programs. (2) Given the regularities 
in the construction of parameterized fifth, seventh, and ninth order integration formulas for the 
simplex and hypercube cases, it would be of interest to try to find a general all-orders rule for 
these. (3) We have not addressed the question of analytic error estimates for the parameterized 
integration formulas. (4) We have not addressed in any systematic way the question of deciding 
which thinning function is optimal for a given choice of integrand. (5) It would be of interest 
to study the systematics, in the moment fitting method, of the tradeoff between the number of 
parameters that are fixed by appropriate conditions, and the number of function calls. 
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XVII. CONTENTS OF PROGRAMS IN DIRECTORIES 

Each file listed in this summary contains multiple programs, each of which begins with comment 
lines describing its function. 

• The simplex programs take as base region the standard simplex of Eq. (|3Up . (When used as 
part of the cubetile programs, the base region is the Kuhn simplex of Eq. (|37p .) 

• The cubetile programs take as base region the side 1 hypercube of Eq. (|39|) , 

• The cube programs take as base region the half-side 1 (i.e., side 2) hypercube of Eq. (|4"T|) . 

• The numbers after simplex or cube indicate the integration orders that are included. 

A. Directory simplexl23 

This directory contains: 

• The subprogram file simplexsubsl23.for. 

• Main program files simplexmainl23.for, cubetilemainl23.for. 

• Recirculating main program files simplexmainl23r.for, cubetilemainl23r.for. 

• MPI parallel main program files simplexmainl23m.for, cubetilemainl23m.for. 

B. Directory simplex4 

This directory contains: 

• The subprogram file simplexsubs4.for. 

• Main program files simplexmain4.for, cubetilemain4.for. 

• Recirculating main program files simplexmain4r.for, cubetilemain4r.for. 

• MPI parallel main program files simplexmain4m.for, cubetilemain4m.for. 
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C. Directory simplex579 

This directory contains: 

• The subprogram file simplexsubs579.for. 

• Main program files simplexmain579.for, cubetilemain579.for. 

• Recirculating main program files simplexmain579r.for, cubetilemain579r.for. 

• MPI parallel main program files simplexmain579m.for, cubetilemain579m.for. 

D. Directory simplex579_16 

This directory contains: 

• The subprogram file simplexsubs579_16.for. 

• Main program files simplexmain579_16.for, cubetilemain579_16.for. 

• Recirculating main program files simplexmain579_16r.for, cubetilemain579_16r.for. 

• MPI parallel main program files simplexmain579_16m.for, cubetilemain579_16m.for. 

E. Directory cubel3 

This directory contains: 

• The subprogram file cubesubsl3.for. 

• Main program file cubemainl3.for. 

• Recirculating main program file cubemainl3r.for. 

• MPI parallel main program file cubemainl3m.for. 

F. Directory cube579 

This directory contains: 

• The subprogram file cubesubs579.for. 
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• Main program file cubemain579.for. 

• Recirculating main program file cubemain579r.for. 

• MPI parallel main program file cubemain579m.for. 

G. Directory cube579_16 

This directory contains: 

• The subprogram file cubesubs579_16.for. 

• Main program file cubemain579_16.for. 

• Recirculating main program file cubemain579_16r.for. 

• MPI parallel main program file cubemain579_16m.for. 



